Preface


The present book is based on lectures given by the author over
a number of years to students of various colleges studying engineering 
and physics. The book includes some optional material which can 
be skipped for the first reading. The corresponding items in the 
table of contents are marked by an asterisk. 

In designing this course the author tried to select the most impor- 
tant mathematical facts and present them so that the reader could 
acquire the necessary mathematical conception and apply mathema- 
tics to other branches of science. Therefore in most cases the author 
did not give rigorous formal proofs of the theorems and intentionally 
simplified their statements referring the reader to characteristic 
particular cases and obvious examples. The rigorousness of a proof 
often fails to be fruitful and therefore it is usually ignored in practical 
applications. Some purely mathematical stipulations are made in 
the book only in the cases when they help the reader to avoid mis- 
conception in theory and application. Mathematical facts and 
objects which can be regarded as exceptional from the point of 
view of applied science are not even mentioned in the book. (For 
instance, when we speak about “all functions” we do not include 
the functions which are not Lebesgue measurable and even such 
functions as the everywhere discontinuous Dirichlet function and
the like.) We tried to demonstrate the meaning of the basic mathe- 
matical concepts and to give a convincing explanation of the most 
important mathematical facts on the basis of intuitive notions. 
It is the author’s belief that in applied mathematics an explanation 
of this kind should be regarded as a proof. Such an approach is 
characteristic of applied mathematics whose main purpose is to 
provide an adequate qualitative description of a phenomenon and 
obtain the numerical solution of the corresponding problem in the
most economical manner without exerting unnecessary effort. This 
approach essentially differs from that of pure mathematics whose 
corner-stone is the logical consistency of all the considerations based 
only on the concepts which have an exhaustive logical foundation.
The author is sure that it is the aspects of applied mathematics that 
must determine the character of mathematical education of an 
engineer and physicist. (But of course a teacher of mathematics 
should have a good command both of pure and applied mathematics.) 

These ideas of the author concerning mathematical education 
(represented in greater detail in his article on applied mathematics 
published in the journal Vestnik Vysshei Shkoly, 1967, No. 4, pp. 74- 
80) are still difficult to realize consistently. Therefore the author 
will be grateful to the readers for any advice and criticism. 

The book is composed in such a way that it is possible to use it 
both for studying in a college under the guidance of a teacher and 
for self-education. The subject matter of the book is divided into 
small sections so that the reader could study the material in suitable 
order and to any extent depending on the profession and the needs 
of the reader. It is also intended that the book can be used by 
students taking a correspondence course and by the readers who 
have some prerequisites in higher mathematics and want to perfect 
their knowledge by reading some chapters of the book. For this 
purpose we sometimes refer the reader to supplementary books (the 
bibliography is placed at the end of this course; the references are 
indicated by numbers in square brackets). We also supply the book 
with the name index, subject index and the list of symbols which 
enable the reader to find a desired definition, term or symbol. 

In some colleges analytic geometry and linear algebra are studied 
as independent courses. The structure of the book facilitates such 
a separation: the fundamentals of analytic geometry and linear 
algebra are given in Chapters II, VI, VII, X and XI. 

Some attention should be paid to the way of the numeration of 
the formulas and sections in the book. The sections entering into 
each chapter are numerated in succession beginning with the first 
number. In references inside each chapter we omit the number of the 
chapter. For instance, the expression “formula (2)” placed in the 
text of Chapter VI means “formula (2) of Chapter VI”. But when
formula (2) of Chapter VI is mentioned in some other chapter we 
write “formula (VI.2)”. Similarly, “§ II.3” means “§ 3 of Chapter II” 
but we simply write “§ 3” when § 3 of Chapter II is referred to in 
this chapter; the expression “Sec. V.6” means “Sec. 6 of Chapter V” 
and so on. 

Studying the theoretical material should be followed by solving 
problems and doing exercises. For this purpose we can recommend 
the well-known collections of problems [2], [4], [26] and [47]. But it
should be noted that some divisions of applied mathematics are 
not treated to a sufficient extent in these collections and therefore 
it is advisable that a teacher of mathematics should add some
interesting and instructive problems concerning these divisions.
The book can be of use to readers of various professions dealing
with applications of mathematics in their current work. Modern 
applied mathematics contains, of course, many important special 
divisions which are not included in this book. The author intends 
to write another book devoted to some supplementary topics such 
as the theory of functions of a complex argument, variational calculus, 
mathematical physics, some special questions of the theory of
ordinary differential equations and so on.

When preparing the book for the second edition the author con- 
siderably revised the text and added some new material including 
the chapter on the theory of probability*. Besides, the author has 
taken into account valuable advice and criticism received from 
many mathematicians, in particular from the members of the Mos- 
cow mathematical society where the book was discussed. Some 
sections of the book were written or revised under the influence of 
ideas and useful comments of L. M. Altshuler, Ya. B. Zeldovich 
and B. O. Solonouts. To all of them the author expresses his warmest
gratitude. 

A. D. Myskis

April 19, 19G6 


* This edition is the English translation of the second Russian edition 
of the book. The first Russian edition contained a chapter in which a brief 
review of basic equations of mathematical physics was given. The chapter was 
excluded from the second edition because of some changes in the syllabus of tech- 
nical colleges. We have included the material of this chapter in this English 
edition as the Appendix at the end of the book. The present translation incorpo- 
rates suggestions made by the author. — Tr. 
Introduction


1. The Subject of Mathematics. Numerical calculations are pene- 
trating into the fields of work of physicists, chemists and engineers 
of various specialities. The modern development of science and 
engineering makes it necessary to deduce and apply still more com- 
plicated laws, to solve very complicated problems and perform 
extensive calculations. 

All such calculations are based on mathematics, the science which 
treats of relations existing between spatial forms, quantities and 
magnitudes of the real world. All the basic notions of mathematics 
emerged and were developed in connection with the demands of 
natural sciences (physics, mechanics, astronomy etc.) and enginee- 
ring. The appearance of more complicated problems led to the crea- 
tion of more sophisticated mathematical methods of investigation 
(i.e. mathematical rules, techniques, formulas and the like) and, 
in particular, to the foundation of higher mathematics. It is there- 
fore not accidental that the fundamentals of higher mathematics 
were created in the 17th and 18th centuries, i.e. at the beginning 
of an intensive development of industry, although some elements 
of higher mathematics appeared as early as antiquity in the works 
of the great Greek mathematician and mechanician Archimedes 
(287-212 B.C.).

Higher mathematics was founded in the works of the prominent 
French philosopher, physicist, mathematician and physiologist 
R. Descartes (1596-1650), the great English physicist, mechanician, 
astronomer and mathematician I. Newton (1642-1727), the great 
German mathematician and philosopher G. Leibniz (1646-1716), 
the great mathematician, mechanician and physicist L. Euler 
(1707-1783) and many other famous scientists. In their works diffe- 
rent divisions of mathematics were created for investigating phe- 
nomena of nature and solving engineering problems. In mathematics, 
as in other sciences, practical work is the main source of scientific 
discoveries. Another important source is the need of mathematics
itself to systematize the facts discovered, to investigate their
interrelations and so on.

2. The Importance of Mathematics and Mathematical Education. 
Mathematics and, in particular, higher mathematics plays a very 
important role in modern natural science and engineering. Mathema- 
tics lies in the foundation of all divisions of physics, mechanics 
and many divisions of other natural sciences, engineering and some 
other branches of knowledge. Designing the construction of an
airplane or a dam of a hydro-electric power station, investigating 
complicated processes involved in deformation of metals, propa- 
gation of radio-waves, diffusion of neutrons in an atomic reactor etc. 
cannot be performed without systematic application of mathematics.

The high level of development of computational methods in the 
USSR is one of the main factors that led to the triumphant achieve- 
ments in launching the first artificial satellites of the Earth, space 
rockets and spacecraft. The creation of high-speed electronic compu- 
ters and other mathematical automatic devices leads to further 
extension of the application of higher mathematics and facilitates 
the introduction of computational methods into many new fields. 
In particular, this is the case in such fields as economics, manage- 
ment and control of industry, elaborating optimal (i.e. the best) 
plans of capital investments or construction, transportation problems, 
controlling technological processes, dispatching and so on. The 
application of mathematical methods in these fields has already 
proved to be very effective and profitable. In recent years mathematics
has been penetrating into such traditionally “non-mathematical”
fields as biology, physiology, geography etc. 

Therefore nowadays the requirements for mathematical educa- 
tion of an engineer are very high. An engineer must know the basic 
principles of higher mathematics and be able to apply them to con- 
crete problems. Then mathematics will become a powerful tool 
in his hands. Besides, a great deal of scientific literature and many 
special technical subjects are saturated with mathematical techni- 
special technical subjects are saturated with mathematical techniques
and formulas. Without sufficient knowledge of mathematics
much effort is needed to understand all these formulas which may
hinder the reader’s work and mislead him. Mathematics also faci- 
litates a better understanding of many questions related to other 
sciences (the theory of vibrations, mechanics of continuous media 
and so on). 

3. Abstractness. Mathematics itself is not a technical subject 
and therefore a course in mathematics for engineers and scientists 
must not treat any special technical questions. Its aim is to provide 
the necessary mathematical education. Therefore a student may 
sometimes feel that the questions treated in a course of higher mathe- 
matics are too abstract. But the abstractness of mathematics is one 
of its most essential features. This does not at all mean that mathematics
has little to do with practical activities. On the contrary,
it is the possibility to apply mathematics to various kinds of activities
that makes its abstractness so important. For instance, in
geometry we consider an “abstract” cylinder and find its volume.
This immediately enables us to compute the volume of any concrete 
cylinder no matter whether it is a component of a mechanism or 
a column or a portion of space occupied by an electric field. Simi- 
larly, in higher mathematics we deduce some general abstract laws 
whose statements are not directly connected with a particular form 
of practical activity, a natural science or engineering, but the con- 
crete applications and realizations of these laws (which are studied 
as examples in a mathematical course) are always related to various 
phenomena of the real world. 

Thus, mathematics considers pure, ideal (schematized) forms, 
relations, processes etc. whose realization serves only as an approximation
to reality. For instance, a real cylinder can never be a
perfect cylinder from the mathematical point of view. Here we see 
the manifestation of a distinguishing feature characteristic of any 
kind of human cognition: when considering a real object or process 
we always select a number of basic properties from an infinite variety 
of properties of the object or process and investigate these most 
essential properties abstracting them from inessential ones. But 
it may sometimes happen that an assumption, hypothesis, that all 
the properties except those chosen as basic ones are inessential is 
not true and then we can arrive at a contradiction between our 
mathematical inferences and reality. Such a possibility must never 
be forgotten! 

Because of the abstractness of forms and relations the logical 
consistency of inferences in mathematics is extremely important, 
more important than in other sciences, this being well known even 
from elementary mathematics. In higher mathematics too, all 
the assertions must be completely clear and logically justified so 
that it should be possible to regard them as objective laws adequate 
to reality. In mathematics, and particularly in higher mathematics, 
we also sometimes draw certain conclusions from experiment, obser- 
vation and analogy but nevertheless such a situation is rarer in 
mathematics than in other sciences. 

There is a characteristic tendency in mathematics to deduce all 
the assertions from a few basic principles (called axioms). This is 
the so-called deductive method. But in our introductory course
which is intended for those who are mainly interested in applications 
we shall not rigorously follow this method in all cases. The reader 
interested in theory may find some inferences in our course to be
imperfect from the point of view of logic. If he wants to get a better 
understanding of some exceptions to general rules and to study higher
mathematics more thoroughly, he should study a course written
for mathematicians, for instance, [14]. To consider the same questions
from different points of view it is advisable to take other courses
intended for technical colleges (for instance, [5], [37], [44] and 
we particularly recommend book [5]). 

4. Characteristic Features of Higher Mathematics. There is no
distinct boundary between elementary and higher mathematics,
the division being conditional. These are not at all different sciences
and the division is mainly accounted for by some historical reasons
as elementary mathematics and higher mathematics were created
in different historical epochs. But nevertheless we can point out
some characteristic features of higher mathematics.

One of them is the universality, generality, of its methods. As
an example, let us take the problem of finding volumes of solids.
Elementary mathematics gives us different formulas for computing
the volumes of a prism, pyramid, cone, cylinder, sphere and some
other simple solids. Each formula is obtained on the basis of a
special argument which is rather complicated in certain cases. But in
higher mathematics we have general formulas expressing the volume
of any solid, the length of any curve, the area of any surface and
the like. Take another example. Consider the problem of investigating
the motion of a material point under the action of given forces.
In elementary courses in physics (based on elementary mathematical
methods) we study only uniform rectilinear motion, uniformly
accelerated rectilinear motion, uniformly decelerated rectilinear motion
and uniform circular motion, and it is rather difficult to investigate
other types of motion by means of techniques of elementary
mathematics. But the methods of higher mathematics make it possible
to investigate any type of motion which can be encountered in
practical problems.

There is another characteristic feature of higher mathematics
(related to the above one). It is the systematic consideration of
variable quantities. When investigating various objects and processes
by means of elementary mathematics we usually regard such
important quantities as velocities, accelerations, densities, masses,
forces etc. as being invariable, constant (and yet we attain our
aim only in some simple cases). But if these quantities vary considerably
(as is often the case) we cannot regard them as being constant.
To solve such problems we usually apply higher mathematics.
There is a branch of higher mathematics (called differential calculus)
which is one of the earliest divisions of mathematics particularly
intended for solving various problems connected with an investigation
of the dependence of one quantity upon another. The quantities
and their interrelations can be of any nature (for instance,
we can consider the relation between the acceleration, velocity
and path length of a motion or between the density, mass and volume
and the like). Therefore differential calculus deeply penetrates into
various natural sciences and engineering. 

The third characteristic feature of higher mathematics is the 
close relationship between its various divisions and the systematic 
unification of the computational, analytical (based on formulas) 
and geometric methods in contrast to elementary mathematics in 
which the connection between algebra and geometry is more or 
less accidental. In higher mathematics, the coordinate method re- 
duces geometric problems to solving algebraic equations, graphs 
are used for representing relations between variable quantities, 
analytical methods of integral calculus are applied for computing 
areas and volumes of geometric figures and so on. 

Some historical remarks will be given in due course in this book. 
But it is expedient to make some introductory notes here. The 
most important divisions of higher mathematics which now form 
the basis of the syllabus for engineers of many specialities were 
created in the 17th and 18th centuries. They include the coordinate 
method, differential and integral calculus etc. These divisions are 
represented in courses of higher mathematics for engineers mostly 
in the form they appeared after the works of Euler. L. Euler (a Swiss 
by birth) spent most of his life in Russia and died in Petersburg. 
Most of his works (473 out of 865) were published in Russia. His 
outstanding results in various divisions of mathematics, mechanics, 
physics and other sciences lie in the foundation of these divisions. 

Mathematics was created by scientists of many countries. Among 
Russian mathematicians we should mention N. I. Lobachevsky 
(1792-1856), the creator of a non-Euclidean geometry. He also obtai- 
ned some important results in other divisions of mathematics and 
initiated mathematical studies in Kazan. An intensive development 
of mathematics in Petersburg began with the works of the prominent 
mathematician Academician M. V. Ostrogradsky (1801-1862). The 
founder of the famous Petersburg mathematical school was the 
great Russian mathematician and mechanician Academician 
P. L. Chebyshev (1821-1894). He obtained many important results 
in various fields of mathematics and its applications to the theory 
of mechanisms, cartography etc. 

After Chebyshev most prominent representatives of the Peters- 
burg mathematical school were Academician A. A. Markov (1856- 
1922), a famous mathematician and the creator of the theory of 
random processes, and Academician A. M. Lyapunov (1857-1918), 
the founder of the theory of stability. 

Since the second half of the 19th century mathematical investi- 
gations have been developing in Kiev, Moscow, Odessa, Kharkov, 
and other Russian towns. 

5. Mathematics in the Soviet Union. In the Soviet Union there 
are many centres of mathematical research. Among prominent 
Soviet mathematicians we should mention Academicians A. D. Aleksandrov,
P. S. Aleksandrov, N. N. Bogolyubov, V. M. Glushkov,
L. V. Kantorovich, M. V. Keldysh, A. N. Kolmogorov, M. A. Lav- 
rentyev, Yu. V. Linnik, N. I. Muskhelishvili, P. S. Novikov, I. G. 
Petrovsky, L. S. Pontryagin, V. I. Smirnov, S. L. Sobolev, A. N. 
Tikhonov, I. N. Vekua, I. M. Vinogradov, and others. 

Mathematics is being intensively developed both in old centres 
and in new ones in Baku, Erevan, Gorki, Lvov, Minsk, Novosibirsk, 
Rostov, Saratov, Sverdlovsk, Tashkent, Tbilisi, Vilnyus, Voronezh, 
and other towns. 

The role of mathematics in other sciences, industry and enginee- 
ring has considerably increased. Many mathematicians work out 
new theoretical problems of other branches of knowledge connected 
with applications of mathematics. At the same time many physi- 
cists, mechanicians and engineers take part in the development 
and applications of those divisions of mathematics which are related 
to their fields of work. As examples of fruitful unification of mathe- 
matics and its applications we can mention the works of the great 
Russian scientist and one of the founders of modern flight mechanics 
and hydromechanics N. E. Zhukovsky (1847-1921), the prominent 
Russian scientist, mathematician, mechanician and naval architect 
Academician A. N. Krylov (1863-1945), the prominent Soviet 
scientist in the fields of theoretical mechanics, aerodynamics and 
hydromechanics Academician S. A. Chaplygin (1869-1942) and 
others. 

There is no doubt that development of mathematical education 
will further increase the role of mathematics in our life and yield 
fruitful results. 
Variables and Functions


§ 1. Quantities 

1. Concept of a Quantity. It is difficult to give a strict definition 
of a quantity since the notion is extremely general and universal. 
Masses, pressures, charges, different kinds of work, lengths and 
volumes are examples of quantities. It will be sufficient for our 
further aim to regard as a quantity everything that is expressible in 
certain units and completely characterized by its numerical value. 
For instance, masses are measured in grams or kilograms and the 
like. We can say that the area of a circle is a quantity since it is 
completely characterized by its numerical value (for example, 5, π
etc.) if we measure it in certain units, e.g. in square centimetres. 
The circle itself regarded as a geometric figure is of course not a quan- 
tity because it is characterized by a certain geometric form which 
cannot be expressed numerically. 

Many notions which were originally understood only in a quali- 
tative aspect have been recently “advanced” and transferred to the 
class of quantities (for instance, such notions as effectiveness, infor- 
mation and even likelihood). Every change of this kind is a great 
event since it enables us to apply quantitative mathematical me- 
thods to investigating the corresponding notions and this usually 
turns out to be very effective. 

2. Dimensions of Quantities. A unit measure which is used for 
expressing a quantity is called the dimension of the quantity. For in- 
stance, the gram or the kilogram usually serves as the dimension 
of mass. The dimension of area is the square centimetre or the square 
metre and so on. A dimension is denoted by square brackets. For 
instance, if M is a mass and S is an area then [M] = kg (the kilogram)
and [S] = m² (the square metre) in the international system
of units. 

Usually the units of some quantities are regarded as fundamental 
units whereas the units of all other quantities are derived units 
expressed in terms of the fundamental ones. For instance, the units 
of length (m), of mass (kg) and of time (sec) are the fundamental 
units in the international system of units (SI), and the unit of velocity
(m/sec) or of force (kg·m/sec²) is expressed in terms of the
fundamental units.

We can add together and subtract only quantities of the same 
dimension, the dimension of a sum being that of the summands. 
It is permissible to multiply or divide quantities of arbitrary di- 
mensions. The multiplication or division of quantities yields, res- 
pectively, the multiplication or division of their dimensions. 
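
These rules can be illustrated by a short program. The following Python sketch is only an illustration and not part of the course itself: the class name Quantity and the representation of a dimension by the exponents of m, kg and sec are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A numerical value together with the exponents of the units m, kg, sec.

    The tuple layout (m, kg, sec) is a hypothetical convention chosen for
    this sketch only.
    """
    value: float
    dim: tuple = (0, 0, 0)

    def __add__(self, other):
        # Only quantities of the same dimension may be added;
        # the sum has the dimension of the summands.
        if self.dim != other.dim:
            raise ValueError("cannot add quantities of different dimensions")
        return Quantity(self.value + other.value, self.dim)

    def __sub__(self, other):
        # The same restriction applies to subtraction.
        if self.dim != other.dim:
            raise ValueError("cannot subtract quantities of different dimensions")
        return Quantity(self.value - other.value, self.dim)

    def __mul__(self, other):
        # Multiplying quantities multiplies their dimensions,
        # i.e. the exponents of the units are added.
        dim = tuple(a + b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value * other.value, dim)

    def __truediv__(self, other):
        # Dividing quantities divides their dimensions,
        # i.e. the exponents of the units are subtracted.
        dim = tuple(a - b for a, b in zip(self.dim, other.dim))
        return Quantity(self.value / other.value, dim)


length = Quantity(6.0, (1, 0, 0))   # 6 m
time = Quantity(2.0, (0, 0, 1))     # 2 sec
mass = Quantity(5.0, (0, 1, 0))     # 5 kg

velocity = length / time            # dimension m/sec
force = mass * velocity / time      # dimension kg·m/sec^2
```

In this sketch velocity acquires the dimension m/sec and force the dimension kg·m/sec², whereas an attempt to form length + time is rejected with an error.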

We also consider dimensionless (“abstract”) quantities. For instance, 
the ratio of two quantities of the same dimension is dimensionless. 
The numerical value of the ratio of a quantity to the chosen unit 
measure is also dimensionless. For example, the numerical value 
of the mass of 5 kg is the “dimensionless mass” 5. We can also obtain 
a dimensionless mass if we take the ratio of the mass to a certain 
mass which is characteristic of the process in question (such a mass 
is supposed to be well known, and we choose it as a standard to 
compare with). Dimensionless length, time etc. are introduced in 
like manner. 

In mathematics we usually regard quantities as dimensionless. 
Finally, a dimensionless quantity is completely characterized by 
its numerical value, and its “unit measure” is the number 1. 

3. Constants and Variables. A quantity entering into an investigation
can take on either different values or only one fixed value. In
the first case we call the quantity a variable quantity or, in short, 
a variable, and in the second case we call the quantity a constant 
(a constant quantity). Suppose we consider the water in a basin. 
The water pressure measured at different points of the basin is 
a variable since it varies and is different at different points. At the 
same time the water density can be regarded as a constant since
it takes on one and the same value (with a sufficient degree of accu- 
racy) at different points. As another example let us consider the 
process of compressing a given mass of a gas while the temperature 
is kept constant. Then the pressure and the volume are variables 
whereas the mass and the temperature are constants. But it should 
be noted that in a real process the last two quantities inevitably 
vary a little. Hence, we can schematize the process and conditionally 
regard the mass and the temperature as constants only in case their 
real variations are of no importance for our investigation. And in 
many other cases the constancy of some quantities should be under- 
stood in a conditional sense. We must never forget it since, if we 
regard a quantity as a constant in a process in which the variations 
of the quantity, small though they may be, are essential for the 
investigation, we may arrive at wrong conclusions and our schema- 
tized model will not apply. 

It may happen that a quantity which is constant in a certain
treatment of a phenomenon takes on a different value or even becomes 
a variable under some other (though similar) circumstances. The
constant quantities of this kind are called the parameters of the 
process; they are the characteristics of the process. For example, 
the mass and the temperature of a gas are the parameters of the 
process of isothermal compression. When we deal with an electric- 
light bulb we take into account such parameters as the resistance, 
the supply voltage the bulb is designed for and the power consump- 
tion. Even in this case there are some other parameters which may 
also be taken into account (for instance, the sizes of the bulb) but 
usually we do not regard these parameters as basic ones. Generally, 
in all cases it is very important to choose the basic, the most signi- 
ficant parameters among various parameters characterizing an object. 

4. Number Scale. Slide Rule. Quantities can be represented 
visually by means of a number scale. For this purpose we usually 
take a rectilinear axis with a uniform scale. To construct a number 
scale we choose a straight line and a point on the line which serves 
as the origin (the origin is usually designated by the letter O). We
choose one of the directions on the straight line as the positive 
direction and take a certain line segment as a unit of length (the 
positive direction is indicated by an arrow; see Fig. 1). Setting off
the unit segment from the origin in both directions and repeating
the procedure infinitely we obtain all the points which correspond
to the integral values of the quantity. Between the “integer points”
there are points representing fractional values, both rational (such
as 2.03 etc.) and irrational (that is fractional numbers that
are not rational, e.g. −π and the like). In case we have
a dimensional quantity the segment chosen as the length unit also
acquires the corresponding dimension. For example, the numerical
values of time t depicted in Fig. 1 are expressed in seconds; we also
see there the points N (t = −1 sec), O (t = 0 sec) and M (t = 1.37 sec).

Fig. 1 (number scale for the time t, in sec, with the points N, O and M marked)

To each value of the quantity there corresponds a certain point on
the number scale and, conversely, each point on the number scale
corresponds to a certain value of the quantity. Besides, there is a one-
to-one correspondence between the values of the quantity and the
points, that is each point corresponds only to one value and vice
versa. (Here and further we consider real quantities, that is quan- 
tities which take on only real numerical values; complex quantities 
will be treated in Sec. VIII.1.) On the basis of these properties
we often identify the values of a quantity with the corresponding 
points; we simply say “the point t = 1.37 sec” and the like. 

If a quantity is variable it is represented by a point which can 
occupy different positions on the axis (on the number scale). For 
example, such a point can move along the axis as time passes. In 
case the quantity is constant the corresponding point occupies
a fixed position and does not move. A point which represents a 
variable is called a variable point (moving point, current point). 

In practical applications we try to choose the origin and the unit 
of length in such a way that the range of variations of the quantity 
should be represented in the most suitable manner. The origin 
itself is sometimes not depicted because we can draw only a part
of the infinite axis. For example, Fig. 2 represents the number scale
on which the values of the length of a rod subjected to thermal expan- 
sion are shown. 

It is sometimes convenient to use scales which are non-uniform. 
For instance, logarithmic scales are often of use (see Fig. 3). A number
n > 1 is represented on such a scale by a point which is obtained
by drawing the line segment of length k log n (where k is
a factor of proportionality suitably chosen) in the positive direction
from a point A. Positive numbers n < 1 are obtained on the logarithmic
scale by drawing the segment k|log n| in the negative
direction from A because for such n we have log n < 0. 

A logarithmic scale is, in particular, utilized in the construction 
of a slide rule. The instrument consists of a ruler and a slide which 
are graduated with similar logarithmic scales. To understand the 
principle of a slide rule let us suppose that the scales are shifted
with respect to each other (see Fig. 4) so that two points a and b
on the lower scale coincide with the corresponding points a' and b'
on the upper scale. Then we have

$$k \log b - k \log a = k \log b' - k \log a'$$

since the lengths of the shaded line segments are equal. Now after
some simple transformations we obtain (check it up!)

$$\frac{b}{a} = \frac{b'}{a'} \qquad (1)$$

Three of the values a, b, a' and b' being given, we can read the fourth
value on the slide rule. This fourth value will satisfy relation (1).
If, for example, we put a' = 1 then $b = ab'$ or $a = \frac{b}{b'}$. Consequently,
to determine the product of two given numbers a and b' we must
make the point 1 on the slide coincide with the point a on the ruler
and then read the value of the product which is indicated on the
ruler by the point b' of the slide. (Think how to find the quotient
of two given numbers.) It is sometimes convenient to put b' = 10
instead of a' = 1 and to move the slide not to the right but to the
left with respect to the ruler. The slide rule was invented in the
17th century. It is widely used now and facilitates the work of
many technicians, engineers, physicists etc. Supplementary scales
on the slide rule make it possible to perform various additional
operations including extracting roots, taking logarithms, raising
to a power, solving equations of different types and so forth. There
is a number of handbooks on using the slide rule, for example, [36]
and [43] to which we refer the reader.
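
The principle of the slide rule can be imitated numerically. The following is only a minimal sketch in Python and not a description of a real instrument: the names `scale_position` and `slide_rule_product`, the use of decimal logarithms and the value k = 1 are assumptions made for the sake of illustration.

```python
import math

K = 1.0  # factor of proportionality k of the scale (assumed value)


def scale_position(n, k=K):
    """Signed distance from the point A (the mark 1) to the mark n."""
    return k * math.log10(n)


def slide_rule_product(a, b_prime, k=K):
    """Multiply a by b' in the way a slide rule does.

    The mark 1 of the slide is set against the mark a of the ruler, so the
    mark b' of the slide lies at the distance k*log a + k*log b' from A.
    The number read on the ruler at that distance is the product.
    """
    length = scale_position(a, k) + scale_position(b_prime, k)
    return 10 ** (length / k)


# Hypothetical example values: the product of 2 and 3.5.
product = slide_rule_product(2.0, 3.5)
```

Adding the two lengths corresponds to adding the logarithms, which is why the instrument replaces multiplication by a shift of the slide. Numbers smaller than 1 receive a negative position, in agreement with the construction of the logarithmic scale.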

Some curvilinear scales are also of use in certain cases (for example, 
see Sec. IX.1). But in our course we shall usually use rectilinear
axes and uniform scales for representing quantities unless the con- 
trary is explicitly stated. 

5. Characteristics of Variables. A variable which takes on all 
the numerical values or all the values lying between some limits 
is called continuous. On the contrary, a variable which assumes
certain “separated” values is called discrete.

The set of all numerical values which may be assumed by a variable
is called the range of the variable. 

Now we introduce the notion of an interval which is of use for 
characterizing ranges of some types of variables. 

A finite (bounded) interval is the set of all numbers contained 
between two given numbers a and b. The numbers a and b are called 
the end-points of the interval. The end-points a and b may or may
not be included in the interval and this fact should sometimes be
indicated. Respectively, in the first case we call the interval closed
(i.e. when $a \le x \le b$ and the end-points are thus included) and
denote it as [a, b] and in the second case we say that the interval
is open (i.e. a < x < b and the end-points are excluded) and denote
it by (a, b). Finite intervals are represented by line segments on
the number scale. 

There are also unbounded (infinite) intervals for which a or b 
or both a and b may be infinite. For example, if a variable x assumes 
all possible values greater than some constant number a the range 
of the variable is described by the inequalities $a < x < \infty$. This
is an example of an infinite interval; it has no finite right end-point, 
of course, but in such a case we say, conditionally, that the right 
end-point is at infinity. An interval of this kind is also said to have
no upper bound since, when a variable may increase without limit,
we usually interpret the variable as “rising up”. The notion of a
lower bound is understood in a similar manner. The collection of
all real numbers is an interval with neither lower nor upper bound 
(that is, geometrically, the whole number scale). 

The range of a continuous variable is an interval or a collection 
of some number of intervals. For example, if a triangle ABC is deformed
in all possible ways the corresponding angle A is a continuous
variable whose range is the interval $0 < \angle A < \pi$ (in case the numerical
values of the angle are expressed in radians). At the same time
the area S of the triangle has the interval $0 < S < \infty$ as its range
(of course, here we also mean that the numerical values of the area 
are measured in certain units but we are not going to mention details 
of this kind in all cases in future). The range of a discrete variable 
is a set (finite or infinite) of separate real numbers. We can also 
say, in the geometric sense, that such a range consists of separate 
points (but not of entire intervals). For example, let an index assume 
the values 1, 2, ..., n. Then it is a discrete variable.

If a variable changes in a certain process in such a way that its 
numerical values vary only in one direction, that is they either 
increase or decrease, it is called monotonic. The point representing 
a monotonic variable on a number scale moves in one direction. 

It is inconvenient to consider constant quantities apart from 
variables and therefore we can regard a constant quantity as a special
case of a variable, i.e. a variable which all the time assumes
only one fixed value (the same idea is used in mechanics when the
state of rest is regarded as a special case of motion). The range of 
a constant consists of only one point. 

We say that a variable changing in a certain process has an upper 
bound if all the time it remains smaller than a constant (such a 
constant is called an upper bound of the variable; it is clear that 
a variable having an upper bound has in fact an infinitude of upper 
bounds because every constant greater than a given upper bound 
of the variable can serve as a new upper bound). We likewise define 
the notion of a lower bound of a variable. Of course, a variable
may fail to have an upper bound or a lower bound (or both). If a
variable has both an upper bound and a lower bound it is simply
called a bounded variable. Variables having upper bounds (lower 
bounds) are called bounded above (bounded below). 

When investigating different quantities we often use the notion 
of the absolute value of a quantity. As is well known from elemen- 
tary mathematical courses, the notion is defined in the following 
way: 

$$|a| = a \ \text{if} \ a \ge 0 \quad \text{and} \quad |a| = -a \ \text{if} \ a < 0$$

For instance, $|5| = 5$, $|0| = 0$ and $|-5| = 5$ [that is $|-5| = -(-5) = 5$].

Absolute values possess the following simple properties: 

1. $|a + b| \le |a| + |b|$. The inequality is strict in case a
and b have opposite signs and it turns into the equality if otherwise.

2. For any a and b we have

$$|ab| = |a| \cdot |b| \quad \text{and} \quad \sqrt{a^2} = |a|$$

The significance of the last formula is sometimes underestimated 
in elementary mathematical courses and this may be the cause of 
different errors and false conclusions. 

The quantity $|a - b| = |b - a|$ is equal to the distance
between the points a and b lying on the number scale. The inequality
$|x| < h$ $(h > 0)$ defines the interval $-h < x < h$, and the
inequality $|x - a| < h$ defines the interval $-h < x - a < h$, i.e.
$a - h < x < a + h$. An interval of the form $a - h < x < a + h$
is called an h-neighbourhood of the point a. The intervals are shaded
in Fig. 5.
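
The properties of absolute values listed above are easy to try out on a computer. The sketch below is an illustration only; the function names `distance` and `in_neighbourhood` and the sample numbers are assumptions introduced here.

```python
import math


def distance(a, b):
    """Distance between the points a and b on the number scale."""
    return abs(a - b)


def in_neighbourhood(x, a, h):
    """True if x lies in the h-neighbourhood a - h < x < a + h of the point a."""
    return abs(x - a) < h


# The square root of a**2 is |a| and not a itself: here a = -5.
root_of_square = math.sqrt((-5) ** 2)

# Property 1 for numbers of opposite signs: the inequality is strict.
left_side = abs(3 + (-7))
right_side = abs(3) + abs(-7)
```

With these definitions `root_of_square` equals 5 rather than −5, which is exactly the point of the formula for the root of a square.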





§ 2. Approximate Values of Quantities 

6. The Notion of an Approximate Value. It is usually impossible 
to speak about the absolutely precise value of a physical quantity. 
For example, we can never determine the exact value of the length 
of a real object. This is so not only because our measurements are 
imperfect but also because of a complex form of the body which 
makes it impossible to indicate exactly the points between which 
the length should be measured. If we recall that the object consists 
of molecules which are in permanent motion we see that the situa- 
tion becomes still more complicated. Moreover, there is a vast majo- 
rity of cases when the determination of a length with a great accuracy 
is inexpedient and senseless even when the modern level of measurement
techniques makes such an accuracy attainable. For instance,
if we have to design or measure a dwelling-house it would be obviously
senseless to determine the sizes of the building with an accuracy
to within 0.01 mm. The same can be said about masses, pressures
etc. The numerical values of almost all quantities in physics
and engineering (for example, the values of all continuous variables) 
are therefore approximate. 

Mathematical operations on approximate values of quantities 
are called approximate calculations. There exists a special branch 
of science devoted to approximate calculations and we shall study 
some of its rules later on. A. N. Krylov (1863-1945) was one of the 
initiators of developing approximate calculations in the USSR. His 
book [28] (the first edition was in 1911) still retains its significance. 

The appropriate choice of a degree of accuracy for calculations, 
measurements or for manufacturing machine elements is a very 
important operation. When making such a choice one should take 
into account a great many factors, i.e. our requirements, technical 
means, economy etc. 

7. Errors. Let A be the exact value and a an approximate value 
of a quantity. Then the error, that is the deviation of the appro- 
ximate value from the exact one, is equal to A — a. It may be posi- 
tive or negative. As a rule, we do not know the error exactly since 
the exact value A is unknown. Therefore we usually consider the
limiting errors $\alpha_1$ and $\alpha_2$ which form an interval containing the true
error:

$$\alpha_1 < A - a < \alpha_2, \quad \text{i.e.} \quad a + \alpha_1 < A < a + \alpha_2$$

Thus the value of the quantity A is estimated from two sides. For
instance, the formula of the length $L = 9^{+0.2}_{-0.1}$ mm means that the
true value of the length lies between 9 − 0.1 = 8.9 mm and
9 + 0.2 = 9.2 mm.

It is sometimes inconvenient to consider two limiting errors and
therefore we often indicate the maximum absolute error $\alpha$, that is
a value which exceeds the absolute value of the error:

$$|A - a| < \alpha, \quad \text{i.e.} \quad -\alpha < A - a < \alpha \quad \text{or} \quad a - \alpha < A < a + \alpha$$

For example, suppose that the measurement of a length l results 
in the value 137 cm and that we can guarantee the accuracy of 
0.5 cm. This means that we have $\alpha = 0.5$ cm and 136.5 cm < l < 137.5 cm.
Therefore we can write l = (137 ± 0.5) cm.

The maximum absolute error of a measurement does not characte- 
rize it completely. For instance, if we are told that the maximum 
absolute error is equal to 1 cm we do not yet know whether this is 
a great error or not. Indeed, for example, if it were the length of 
a whale or of a beetle our judgment would vary accordingly.

The quality of a measurement is better characterized by its maximum
relative error $\delta$ which is calculated by the formula

$$\delta = \frac{\alpha}{|a|}$$

The maximum relative error is dimensionless and we often express
it in per cent. The value of a relative error is usually rounded for
the sake of simplicity. For instance, the relative error in per cent
for the example of measuring the length l is equal to

$$\frac{0.5 \times 100}{137} = 0.36 \approx 0.4,$$

i.e. we can say that the maximum relative error of
the measurement is equal to 0.4% (or, rounding, to 0.5%).
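
The two characteristics of a measurement can be computed as follows. This is a minimal sketch which assumes that the maximum relative error is the ratio of the maximum absolute error to the absolute value of the approximate number, as in the numerical example above; the function names are introduced here only for illustration.

```python
def bounds(a, alpha):
    """Interval a - alpha < A < a + alpha containing the exact value A."""
    return a - alpha, a + alpha


def max_relative_error_percent(a, alpha):
    """Maximum relative error of the approximate value a, in per cent."""
    return 100 * alpha / abs(a)


# The measurement of the length l = (137 +/- 0.5) cm considered in the text.
low, high = bounds(137, 0.5)
delta = max_relative_error_percent(137, 0.5)
```

Here `delta` is about 0.36, that is 0.4% after rounding, whereas the same absolute error of 0.5 cm for a length of 1.37 cm would give a relative error a hundred times greater.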

The accuracy of the order of 1% or even of 10% is sufficient for 
many approximate calculations. On the other hand, the precise 
measurement of the frequency of electromagnetic vibrations which 
creates the basis of performing the automatic control of spacecraft 
is carried out by means of a crystal or an atomic clock whose error 
in time observation is about $10^{-4}$ sec a day (calculate the corresponding
maximum relative error!).

8. Writing Approximate Numbers. It is desirable to write an 
approximate number, i.e. an approximate value of a quantity, 
in a form which indicates the degree of accuracy. Therefore an 
approximate number is usually written in such a way that all the 
decimal digits except the last one are correct. The admissible error for 
the last decimal digit must not exceed unity (by the way, if the error 
is a little greater one usually admits it). For instance, if we write
the value R = 1.35 Ω of a resistance we mean that $\alpha_R = 0.01$ Ω,
that is in fact 1.34 Ω < R < 1.36 Ω. There is a great difference
between the formulas R = 1.35 Ω and R = 1.3500 Ω since the
former indicates that the corresponding calculations were carried
out with a possible error of 0.01 Ω whereas the latter expresses the
result accurate to 0.0001 Ω. If the result of certain calculations
is R = 2.377 Ω but the third decimal digit may be incorrect, or
if we are not interested in the fourth decimal digit, we should round
off the result and write R = 2.38 Ω.

The number of decimal places to the right of the decimal point 
indicates the maximum absolute error. The total number of correct 
decimal digits (which does not include zeros standing to the left 
of the first nonzero decimal digit) indicates the maximum relative 
error. For instance, the numbers 2.57; 1.7100; 0.015 and 0.00210 
have, respectively, 3, 5, 2 and 3 correct decimal digits. The greater 
the number of correct decimal digits, the smaller the maximum 
relative error. 
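
The count of correct decimal digits can be automated. The sketch below assumes that the number is supplied as a string exactly as it is written (otherwise the final zeros, which carry information about accuracy, would be lost); the function name `correct_digits` is introduced here for illustration.

```python
def correct_digits(text):
    """Count the digits of a written decimal number, ignoring leading zeros."""
    digits = text.replace("-", "").replace(".", "")
    return len(digits.lstrip("0"))


# The four numbers considered in the text.
counts = [correct_digits(s) for s in ("2.57", "1.7100", "0.015", "0.00210")]
```

The list `counts` reproduces the values 3, 5, 2 and 3 given above. Zeros standing at the end of a number are counted, zeros standing before the first nonzero digit are not.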

One should avoid writing expressions of the form M = 1800 g
since such a form often does not indicate the real accuracy of
measurements or calculations. If we suspect that the second decimal digit
may be incorrect we must write $M = 1.8 \times 10^3$ g, and if the fourth
decimal digit is doubtful we must write $M = 1.800 \times 10^3$ g. Strictly
speaking, the formula M = 1800 g means that the maximum absolute
error is equal to 1 g. When the rules of writing approximate numbers
are not followed we often encounter various misunderstandings. 
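
When results are printed by a program the same rule can be followed by fixing the number of digits of the mantissa. The sketch below relies on the standard exponent format of Python; the function name `write_approximate` and the output form with the sign `x 10^` are assumptions made for illustration.

```python
def write_approximate(value, correct_digits):
    """Write value as a mantissa with the given number of digits times a power of 10."""
    mantissa, exponent = f"{value:.{correct_digits - 1}e}".split("e")
    return f"{mantissa} x 10^{int(exponent)}"


# The mass M = 1800 g written with two and with four correct digits.
two_digits = write_approximate(1800, 2)
four_digits = write_approximate(1800, 4)
```

The two strings are `1.8 x 10^3` and `1.800 x 10^3`; they denote the same number but indicate different accuracies.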

9. Addition and Subtraction of Approximate Numbers. Let us take 
an example. Suppose we have weighed a bottle and its cork. Let 
their masses be, respectively, M = 323.1 g and m = 5.722 g (this 
indicates that the scales taken for weighing the cork are more precise). 
It would be wrong to calculate the total weight of the bottle together
with the cork as

M     = 323.1
m     =   5.722
M + m = 328.822 g

Actually, the weight of the bottle is found with an accuracy 
of 0.1 g and therefore the hundredths and the thousandths entering 
into the result are not only unnecessary but even misleading: judging 
by the answer one might think that the value of M + m is accurate
to 0.001 g which is wrong. To perform the addition correctly we 
must therefore round off m to 0.1, that is we must calculate in the 
following way: 

M     = 323.1
m     =   5.7

M + m = 328.8 g

Of course, we should obtain the same result if we rounded the former 
result but such an operation involves the unnecessary calculations 
which should have been avoided. Thus, the number of decimal digits 
entering into a sum must be the same as in the summand with the
greatest absolute error. 

If there are many summands the round-off errors may add up 
and this may result in a great error in determining the sum (some
kind of systematic “short measure”). There is a rule of a reserve
decimal digit which is recommended for such cases: the calculations
are carried out to an extra decimal digit and then we round off 
the result discarding the reserve decimal digit after the sum has been 
calculated. 

For example, let it be required to find the sum

K = 132.7 + 1.274 + 0.06321 + 20.96 + 46.1521

The first summand has the largest absolute error which is equal
to 0.1. Therefore we round off all the other summands to 0.01:

132.7 + 1.27 + 0.06 + 20.96 + 46.15 = 201.14

Now, rounding, we find K = 201.1. If we did not use the rule the
result would be less accurate:

K = 132.7 + 1.3 + 0.1 + 21.0 + 46.2 = 201.3
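
The effect of the reserve decimal digit can be checked by a short program. The sketch below uses decimal arithmetic so that the rounding is performed exactly as by hand; the helper name `round_to` and the choice of rounding a final 5 upwards are assumptions made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to(value, step):
    """Round value to a multiple of step, e.g. step "0.01" keeps two places."""
    return Decimal(value).quantize(Decimal(step), rounding=ROUND_HALF_UP)


summands = ["132.7", "1.274", "0.06321", "20.96", "46.1521"]

# With a reserve digit: round the summands to 0.01, add, then round to 0.1.
with_reserve = round_to(sum(round_to(s, "0.01") for s in summands), "0.1")

# Without a reserve digit: round every summand to 0.1 straight away.
without_reserve = sum(round_to(s, "0.1") for s in summands)
```

The first result is 201.1 and the second is 201.3; the sum of the unrounded summands is 201.14931, so only the calculation with the reserve digit gives the correct last digit.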

Another example. Suppose it is necessary to find the sum
$N = \sqrt{5} + \sqrt{6} + \sqrt{7} + \sqrt{8}$ with an accuracy of 0.01. The integers
under the radical signs are regarded as quite exact. Using the rule 
of a reserve decimal digit we take from a table the values of the 
roots accurate to 0.001: 

2.236 + 2.449 + 2.646 + 2.828 = 10.159 
Thus, N = 10.16. 

If the number of summands is very large, say several hundreds, 
it is advisable to use two reserve decimal digits. 

When we add several summands given with the same number of 
decimal digits to the right of the decimal point we must take into 
account that the maximum absolute error of the sum can be greater 
than those of the summands. It is therefore expedient to round the 
result to the preceding decimal place. For example, let 

L = 1.38 + 8.71 + 4.48 + 11.96 + 7.33 

Adding we get L = 33.86. But the last decimal digit is likely to
be incorrect and therefore we should write the answer in the form 
L = 33.9. 

The maximum absolute error of a sum or of a difference is equal to
the sum of the maximum absolute errors of the operands. For instance, 
if we have two quantities determined with an accuracy of 0.1 then 
it is easy to understand that the sum and the difference of the quan- 
tities are determined with an accuracy of 0.2 because the errors may 
add up. If there are many summands it is unlikely that all the errors 
would add up. In such cases one should use the methods of the theory 
of probabilities (see Sec. XVIII.15) in order to estimate the error
of the sum. These methods imply that we should round the sum
discarding one decimal digit (as it was done above in calculating L)
beginning with, approximately, five summands and two decimal
digits beginning with, approximately, 500 summands.

The rules of subtraction are essentially the same as the rules of 
addition of approximate numbers. At the same time we should take 
into account that when subtracting approximate numbers which 
are close to each other we may get a considerable increase of the 
relative error. For instance, let it be necessary to calculate
P = 327.48 − 326.91. The minuend and the subtrahend have $\alpha = 0.01$
and therefore $\delta \approx \frac{0.01}{327} \cdot 100\% = 0.003\%$. But the maximum absolute
error of the difference P = 0.57 is equal to 0.02 and hence its maximum
relative error is $\delta_P = \frac{0.02}{0.57} \cdot 100\% = 3.5\%$. The relative
error has thus increased 1000 times!
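
The growth of the relative error in this example can be verified directly. The following lines are a sketch only; the variable names are introduced here, and the relative error of the operands is computed for the minuend.

```python
minuend, subtrahend, alpha = 327.48, 326.91, 0.01

difference = minuend - subtrahend

# Relative errors in per cent: of an operand and of the difference.
delta_operand = 100 * alpha / minuend
delta_difference = 100 * (2 * alpha) / difference  # absolute errors add up

growth = delta_difference / delta_operand
```

The value of `growth` is somewhat greater than one thousand, in agreement with the estimate in the text.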

Therefore one must try to avoid calculating the difference of two 
close numbers. In such a case we should transform the corresponding 
expressions in an appropriate way in order to find the difference 
without actually carrying out such a subtraction: one must not try 
to determine the weight of one’s hat by weighing oneself first with 
the hat on and then without it! 

When we deal with formulas containing differences of this kind 
which can noticeably affect the accuracy of calculations we should 
eliminate the differences by transforming the expressions. For 
example, calculating an expression of the form $Q = a - \sqrt{a^2 - b^2}$
(a > 0, b > 0) where b is several times smaller than a (and therefore
$\sqrt{a^2 - b^2} \approx \sqrt{a^2} = a$) we can transform the expression in the
following way:

$$Q = \frac{(a - \sqrt{a^2 - b^2})(a + \sqrt{a^2 - b^2})}{a + \sqrt{a^2 - b^2}} = \frac{b^2}{a + \sqrt{a^2 - b^2}}$$

The last expression no longer contains the undesirable difference. 
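
The advantage of the transformed expression becomes visible when every operation is carried out with a limited number of digits. The sketch below imitates a calculation with four correct digits by means of decimal arithmetic; the values a = 10 and b = 1 and the function names are assumptions chosen for illustration.

```python
from decimal import Decimal, localcontext


def q_direct(a, b):
    """Q computed as the difference of two close numbers."""
    return a - (a * a - b * b).sqrt()


def q_transformed(a, b):
    """Q computed by the transformed expression without the difference."""
    return b * b / (a + (a * a - b * b).sqrt())


with localcontext() as ctx:
    ctx.prec = 4  # every operation keeps four significant digits
    a, b = Decimal(10), Decimal(1)
    direct = q_direct(a, b)
    transformed = q_transformed(a, b)
```

Here `direct` is 0.050 and `transformed` is 0.05013, while the value of Q to six places is 0.050126. The direct calculation thus retains only two correct digits whereas the transformed expression retains all four.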

10. Multiplication and Division of Approximate Numbers. General 
Remarks. Let us begin with an example. Suppose it is necessary to 
determine the area S of a rectangle with the sides a = 5.2 cm and 
b = 43.1 cm. It would be wrong to give the answer
$S = 5.2 \times 43.1 = 224.12$ cm$^2$.

In fact, a is contained between 5.1 and 5.3, and b between 43.0
and 43.2. Thus, the area is contained between

$$S_1 = 5.1 \times 43.0 = 219.3\ \text{cm}^2 \quad \text{and} \quad S_2 = 5.3 \times 43.2 = 228.96\ \text{cm}^2$$

We see that all the decimal digits beginning with the second one in
the above value of S may be incorrect and therefore they may only
lead to misconceptions. In this case the correct answer we must give
is $S = 2.2 \times 10^2$ cm$^2$.

By the way, we note that the calculation of $S_1$ and $S_2$ demonstrates
the way that can be followed in estimating the results in other
problems.
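
The two-sided estimate of a product can be written as a small function. The sketch below is valid for positive factors only; the name `product_bounds` and the representation of the numbers as strings (to keep the decimal arithmetic exact) are assumptions made for illustration.

```python
from decimal import Decimal


def product_bounds(a, b, alpha_a, alpha_b):
    """Smallest and greatest possible product of two positive approximate numbers."""
    a, b, alpha_a, alpha_b = map(Decimal, (a, b, alpha_a, alpha_b))
    return (a - alpha_a) * (b - alpha_b), (a + alpha_a) * (b + alpha_b)


# The sides of the rectangle: a = 5.2 cm and b = 43.1 cm, both known to 0.1 cm.
s1, s2 = product_bounds("5.2", "43.1", "0.1", "0.1")
```

The bounds are 219.3 and 228.96, so that only the first digit of the area is certain and the second one is already doubtful.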

Thus we see that in multiplying two numbers with two and three 
correct decimal digits we must retain two decimal digits in the 
answer. The same rule holds for the general case of multiplication 
of approximate numbers and also for their division: the number 
of correct decimal digits in the result must be equal to the smallest
of the numbers of correct decimal digits in the factors (or in the
dividend and the divisor in the case of division). The reason for
this general rule is that, in the first place, the operations of
multiplication and division performed on approximate numbers yield
the addition of the corresponding maximum relative errors (this
will be shown in Sec. IX.11) and, in the second place, the number
of correct decimal digits and the maximum relative error indicate 
similar qualities connected with the degree of relative accuracy. 

In the example of calculating S the maximum relative error of b
is considerably smaller than that of a and therefore
$\delta_S = \delta_a + \delta_b \approx \delta_a$, that is S has the same number of correct
decimal digits as a.
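
For the rectangle considered above the relative errors can be compared numerically. This is a sketch under the assumption that both sides are known with a maximum absolute error of 0.1 cm; the variable names are introduced here.

```python
a, alpha_a = 5.2, 0.1    # side a and its maximum absolute error, cm
b, alpha_b = 43.1, 0.1   # side b and its maximum absolute error, cm

delta_a = alpha_a / a          # about 1.9%
delta_b = alpha_b / b          # about 0.2%
delta_s = delta_a + delta_b    # relative errors add up in a product
```

The error of b contributes only about a tenth of the total, so `delta_s` is about 2.2% and differs little from `delta_a`.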


If the factors entering into a product are given with different 
numbers of correct decimal digits we must round the numbers before 
multiplying them and retain one reserve decimal digit which is 
discarded after the operation is performed. In case the factors have 
the same number of correct decimal digits but there are many factors 
(for instance, more than four) it is advisable to reduce the number 
of correct digits in the product by one. 

As an example, let us take the formula $Q = 0.24 I^2 R t$ which is
applied to calculating heat generated by an electric current. In
this case the answer cannot have more than two correct decimal
digits because the coefficient 0.24 has only two correct digits. Therefore
there is no sense in taking I, R and t with more than three correct
decimal digits (moreover, the third digit is taken only as a reserve 
digit). If a more accurate value of Q is desirable we must first of all 
specify the value of the coefficient. 

It should be noted that absolutely exact factors do not affect 
the choice of the number of correct decimal digits in a product. For 
instance, the coefficient 2 entering into the formula $L = 2\pi r$ of the
circumference of a circle is absolutely exact (we can write it as
2.0 or 2.00 etc.) and therefore the accuracy of calculations depends
only on the number of correct decimal digits to which $\pi$ and r are
computed.

Let us take an example involving all the above rules. Let
$D = 11.3^2 \times 5.4 + 0.381 \times 9.1 + 7.43 \times 21.1$. In order to estimate
the magnitude of the summands we calculate them rounding to one
correct decimal digit. Thus we get 500, 3.6 and 140. Hence the
sum is of the order of several hundreds. The factor 5.4 entering into
the first summand (which is the largest one) has only two correct
decimal digits and thus the whole result must have two correct
digits. Now, according to the rule of a reserve decimal digit, we
must calculate to within unity and then round off the result to the
nearest ten. Thus we obtain D = 690 + 3 + 157 = 850, i.e.
$D = 8.5 \times 10^2$.
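
The whole calculation can be repeated by a program. The sketch below assumes that the first summand reads 11.3² × 5.4, which reproduces both the rough estimate 500 and the value 690 used in the text; the helper name `round_to_digits` is introduced here for illustration.

```python
import math


def round_to_digits(x, digits):
    """Round a nonzero number x to the given number of significant digits."""
    exponent = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - exponent)


terms = [11.3 ** 2 * 5.4, 0.381 * 9.1, 7.43 * 21.1]

# Calculate to within unity (one reserve digit), then keep two digits.
rounded_terms = [round(t) for t in terms]
d = round_to_digits(sum(rounded_terms), 2)
```

The rounded summands are 690, 3 and 157 and the final value of `d` is 850, that is 8.5 × 10².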

Calculations with unnecessary digits are not only useless but 
even misleading because they may give the illusion of an accuracy 
greater than that we actually have. 

The choice of a degree of accuracy of approximate quantities for 
performing mathematical operations on them is made in accordance 
with a general principle which states that all the degrees of accuracy 
which we choose must be coherent to each other at every stage of 
our calculations. This means that none of the degrees must be too 
great or too low. 

We shall take an example to illustrate the principle. Suppose 
we have to calculate the area of a rectangle by the formula S = ab. 
Let a be measured or calculated to three correct decimal digits. 
Then we must take b also with three correct digits because the fourth 
decimal digit of b would be useless whereas if we determined b 
only with two correct digits the efforts applied to finding the third 
digit of a would be futile. Therefore when we calculate a product 
it is convenient to take the factors (at least those factors which are 
difficult to determine) with the same number of correct decimal 
digits. Similarly, the summands entering into a sum must be taken 
with the same number of decimal digits to the right of the decimal 
point. 

Here we give an example. Let the expression M = ab + cd
be calculated and let it be known that $a \approx 30$, $b \approx 6$, $c \approx 0.1$
and $d \approx 40$. Suppose that a is taken with three correct decimal
digits. What number of correct digits should be chosen for b, c
and d? It is clear that we must take three correct decimal digits
for b according to the accuracy of a. Further, we have $ab \approx 180$ and
$cd \approx 4$. This implies that for calculating M with three correct
digits (the accuracy of a makes it impossible to obtain M with 
more than three correct digits) it is sufficient to determine c and 
d with only one correct decimal digit. If it is not too difficult the 
accuracies of b, c and d should be increased by one decimal digit 
but the extra digit is only a reserve one. 

When performing practical calculations we often face a problem 
which is in some sense inverse to the above problem. The degree 
of accuracy of a desired result is sometimes set beforehand according 
to some prerequisites and then it is required to determine the necessary
degrees of accuracy of the quantities involved in the calculations
(and the accuracy of the calculations). Some of the quantities may
be obtained as a result of an experiment and therefore our discussion
also applies to the determination of a desirable precision of an
experiment. The solution of the inverse problem is based on the
rules of approximate calculations we have studied here. For example,
suppose we have to calculate the total surface area of a circular
cylinder by the formula $S = \pi \left( DH + \frac{D^2}{2} \right)$. Let it be approximately
known that $D \approx 20$ cm and $H \approx 2$ cm. Then $S \approx 700$ cm$^2$
(check it up!). Now turning to the inverse problem and reasoning
as in the preceding paragraph we see that if, for instance, we want
to have the result with three correct decimal digits, i.e. with an
accuracy of 1 cm$^2$, then $\pi$ and D should be taken with three correct
decimal digits and H with two correct digits. Thus, measuring D
and H we must attain the accuracy of 1 mm. It is better to calculate
with a reserve decimal digit, and $\pi$ should also be taken with a reserve
digit. But if we wanted to have more accurate values of D
and H we would have to perfect our measuring instruments.
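
The rough value of the area is easily checked. The sketch below assumes that the total surface consists of the lateral surface πDH and of two end faces of area πD²/4 each, where D is the diameter and H is the altitude; the function name is introduced here for illustration.

```python
import math


def cylinder_total_area(d, h):
    """Total surface area: lateral surface pi*D*H plus two end faces pi*D**2/4."""
    return math.pi * (d * h + d ** 2 / 2)


# Approximate dimensions from the text: D = 20 cm, H = 2 cm.
s = cylinder_total_area(20, 2)
```

The result is about 754 cm², i.e. of the order of 700 cm², and thus has three digits to the left of the decimal point, so that three correct digits correspond to an accuracy of 1 cm².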

The rules of determining degrees of accuracy for more complicated 
formulas will be given in Secs. IV.10 and IX.11.

§ 3. Functions and Graphs 

11. Functional Relation. When investigating a phenomenon or 
a problem we often deal with several variables which are interrelated 
so that a change of one of the variables affects the values of the 
others. Then we say that there is a functional relation between the 
variables. For example, suppose a mass of a gas is kept under chan- 
ging conditions. Then there is a functional relation between the 
volume V, the temperature T and the pressure p of the gas because,
as is well known from physics, the quantities are interrelated. We 
also have a functional relation between the area of a circle and its 
radius, between the distance passed over in a process of motion and 
the time taken and so forth. 

Usually it is possible to pick out certain variables from a number 
of interrelated quantities such that the values of the variables can 
be taken arbitrarily whereas the values of the other quantities are 
determined by the values of the variables entering into the first 
group. The variables of the first type are called independent vari- 
ables (or arguments) and the variables of the second type are called 
dependent variables (or functions). As an example let us consider 
the relationship between the area S of a circle and the length R 
of its radius. It is natural to regard R as an independent variable 
and choose its values arbitrarily; then the area computed by the 
formula $S = \pi R^2$ is a dependent variable in this functional relation.
In the example of the mass of gas we could have taken V and T
as independent variables. Then the variable p (the pressure) would 
have been regarded as a dependent variable. 

A rule (a law) according to which to the values of independent variables
there correspond the values of a dependent variable is called a function. 
Thus, every time there is a law of correspondence between the values 
of variables we say that there is a functional relation. The concept 
of a function is one of the most important mathematical notions. 

By the way, the term “function” is sometimes used in a different 
sense. As it has been mentioned, independent variables are called 
arguments and a dependent variable itself is called a function. 
Such a twofold sense of the term does not, however, lead to any 
misunderstandings. 

It should be noted that when we have a functional relation between 
variables the distinction between the independent variables and 
the dependent ones is sometimes conditional. For instance, in the 
example of the mass of gas we could have taken T and p as indepen- 
dent variables and V as a function. We can easily construct the 
scheme of an experiment in which T and p can be varied arbitrarily 
whereas V depends on T and p. Of course, the choice of variables 
which are regarded as independent quantities may be important 
in some cases. The choice should be made in a natural and conve- 
nient way in accordance with the circumstances. 

Functions may depend on one argument (as in the example of the 
area of a circle) or on two or more arguments. In the first two chapters 
of our course we shall consider (almost without exceptions) functions 
of one independent variable. 

We must note that when we regard a quantity y as a function 
of an independent variable x we do not necessarily suppose that 
there is a meaningful causal relationship between the variables. 
It is quite sufficient if there exists a rule which attributes a certain 
value of y to each x even if we do not know this law of correspondence.
For example, the temperature $\theta$ at a point in space can be
regarded as a function of time t since it is clear that we always have
a certain temperature at the point at each moment t, that is to the
values of t there correspond the values of $\theta$, although the variations
of $\theta$ cannot be simply accounted for by the changes of t since in
reality these variations are determined by some complicated physical 
laws. 

12. Notation. If a variable y is a function of a variable x we
usually write $y = f(x)$ (this is read as “y is equal to f of x”) where
f is the sign of a function. If we make x assume certain particular
(concrete) values the function will assume its particular values.

For instance, let $y = f(x)$ be of the form $y = x^2$. Then $y = 4$
for $x = 2$, $y = 0.36$ for $x = -0.6$ etc. This can be written as
$f(2) = 4$, $f(-0.6) = 0.36$ and so on, or as $y|_{x=2} = 4$,
$y|_{x=-0.6} = 0.36$ etc. The vertical lines in the last expressions
are the signs of substitution which mean that we substitute the values
of x for the argument.

The notation y = f (x) is used when the concrete expression of 
a function is too complicated or when we do not know the expression. 
It is also used for formulating general rules and properties of all 
functions or of many concrete functions. For example, the formula 
$(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$, which is well known from
algebra, is written in letters. Here the letters a and b are not concrete
numbers but they can be replaced by any numbers. 

If we consider several functions simultaneously we can use, besides
f, any other letters: F, $\varphi$, $\Phi$ etc. We can also introduce
different subscripts, superscripts, and other indices, for example
$f_1$, $f_2$ or a letter F carrying a superscript. At the same time
when we consider different problems we can denote different functions
by the same letter f. We remind the reader that we have a similar
situation in algebra: a letter, say the letter a, may denote different
quantities in different problems, but we must not denote by a different
quantities entering into one and the same problem. On the other hand,
different quantities may sometimes be connected by one and the same
functional relation. In such a case we can use one and the same letter
f because f designates the law of dependence of one quantity upon
another and is irrelevant to the way the quantities are denoted. For
example, if $y = x^3$, $z = u^5$ and $v = t^3$ then we can write
$y = f(x)$, $z = \varphi(u)$ and $v = f(t)$. In this case the sign f
indicates raising to the third power whereas $\varphi$ indicates the
operation of raising to the fifth power.

Functions of several variables are denoted similarly. For instance,
let $z = x^2 - x \cdot 2^y$. Here x and y are independent variables and
z is regarded as a dependent variable. We can write $z = f(x, y)$ where
the comma is a sign indicating the dependence of the function on
two arguments. In this case particular values are found in the
following way:

$$f(2, 1) = z|_{x=2,\ y=1} = 2^2 - 2 \times 2^1 = 0;$$

$$f(1, 2) = z|_{x=1,\ y=2} = 1^2 - 1 \times 2^2 = -3$$

and so on.

One must get used to this notation and operate on it quite freely.
Here we give several examples of such operations. Suppose we have
the functions $y = f(x) = x^2 - 3x$ and $z = \varphi(x) = 2x + 1$, and
let a be a constant number. Then

$f(a) = a^2 - 3a$ (this is the value of the first function for $x = a$);

$\varphi(a^2) = 2a^2 + 1$ (this is the value of the second function for
$x = a^2$);

$f(x^2) = (x^2)^2 - 3x^2 = x^4 - 3x^2$ [this is the value of y assumed
when $x^2$ is substituted for the argument; we thus obtain a new
function of x which may be denoted as $F(x)$];

$[f(x)]^2 = (x^2 - 3x)^2 = x^4 - 6x^3 + 9x^2$ (this is one more
function of x);

$\varphi(x + a) = 2(x + a) + 1 = 2x + 2a + 1$ (this is another new
function of x);

$f(x)\,\varphi(x) = (x^2 - 3x)(2x + 1) = 2x^3 - 5x^2 - 3x$;

$f(\varphi(x)) = [\varphi(x)]^2 - 3\varphi(x) = (2x + 1)^2 - 3(2x + 1) = 4x^2 - 2x - 2$;

$\varphi(f(x)) = 2f(x) + 1 = 2(x^2 - 3x) + 1 = 2x^2 - 6x + 1$;

$f(x + s) = (x + s)^2 - 3(x + s) = x^2 + 2xs + s^2 - 3x - 3s$
[this is a function of two variables which can be denoted by
$\Phi(x, s)$] etc.
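The manipulations above can be checked numerically. The following is a minimal sketch, assuming the two functions of this section, $f(x) = x^2 - 3x$ and $\varphi(x) = 2x + 1$; the helper name `identities_hold` and the sample values of a and s are illustrative choices, not part of the text.

```python
def f(x):
    """First function from the text: f(x) = x^2 - 3x."""
    return x**2 - 3*x

def phi(x):
    """Second function from the text: phi(x) = 2x + 1."""
    return 2*x + 1

def identities_hold(x, a=1.5, s=-0.75):
    """Compare each composite expression with its expanded form at one point."""
    checks = [
        (f(x**2), x**4 - 3*x**2),
        (f(x)**2, x**4 - 6*x**3 + 9*x**2),
        (phi(x + a), 2*x + 2*a + 1),
        (f(x) * phi(x), 2*x**3 - 5*x**2 - 3*x),
        (f(phi(x)), 4*x**2 - 2*x - 2),
        (phi(f(x)), 2*x**2 - 6*x + 1),
        (f(x + s), x**2 + 2*x*s + s**2 - 3*x - 3*s),
    ]
    return all(abs(lhs - rhs) < 1e-9 for lhs, rhs in checks)
```

A check at a few sample points does not prove an identity, but it quickly exposes a slip in the algebra.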

In particular, in the above examples we come across the operation 
of composing “a function of a function” or, as it is usually said, 
we deal with a composite function. A composite function is usually 
obtained in the following way. Let a variable y depend on a variable
u and let u, in its turn, depend on a variable x. Thus, $y = f(u)$
and $u = \varphi(x)$. Then variations in x change u and therefore y
also varies. Hence, y is a function of x of the form
$y = f(\varphi(x))$. Thus we obtain a composite function. In this case
u is an intermediate variable. There may also be several intermediate
variables.

If we only want to designate that y is a function of x avoiding all
the intermediate operations we can write $y = y(x)$. For instance,
in the examples in Sec. 11 we could have written $S = S(R)$ or
$p = p(V, T)$ and $V = V(T, p)$.

13. Methods of Representing Functions. If we intend to investi- 
gate a function, that is a dependence of one quantity upon another, 
the function must be represented in a certain way. There are several 
methods of representing functions. 

The analytical method (i.e. representing a function by a formula) 
is one of the most widely used methods in mathematics. This method 
describes the mathematical operations which should be performed 
on the independent variable to obtain the value of the function. The 
operations are indicated by a formula. For example, the formula 
$y = x^2 - 2x$ says that in order to compute the value of the function
y we must raise the corresponding value of the argument to the second 
power and then subtract the doubled value of the argument from 
the result. 

The analytical method is compact (i.e. formulas usually occupy
little space) and can be easily reproduced (i.e. it is not difficult
to rewrite a formula). Besides, it is the most suitable method for
performing mathematical operations on functions. Here we mean the
algebraic operations (addition, multiplication and so on), the
operations of higher mathematics (differentiation, integration and the
like), and others. But the method is not visual enough (this means
that when we have a formula it is not always possible to visualize
the character of dependence of the function upon its argument).
The calculation of particular values of a function represented by
a formula (in case the values are needed) may be a complicated
operation. In addition, not all functions can be represented by a
formula, and it may be inconvenient to put down a formula even when
it exists.

It is sometimes necessary to use several different formulas to 
represent a function on different parts of the range of its argument. 
For example, let a material point fall without an initial velocity
on a platform which is moving downwards uniformly with the velocity
v, the distance between the point and the platform at the moment
$t = 0$ being equal to h. Then the path s covered by the point
is a function of the time t, i.e. $s = f(t)$. According to Fig. 6 the
relationship is determined by the formula

$$s = \begin{cases} \dfrac{gt^2}{2} & (0 \le t \le t^*), \\ h + vt & (t^* \le t < \infty) \end{cases} \qquad (1)$$

where g is the acceleration of free fall and $t^*$ is the moment of the
impact of the point against the platform which can be found from the
equation $\frac{gt^2}{2} = h + vt$.
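A function given by different formulas on different parts of its domain translates directly into a branch in a program. The sketch below rests on stated assumptions: before the impact the point falls freely from rest, so that $s = gt^2/2$; after the impact it moves together with the platform, so that $s = h + vt$; and g is taken as 9.81 m/s². The names `impact_time` and `path` are illustrative.

```python
import math

G = 9.81  # free-fall acceleration in m/s^2 (assumed value)

def impact_time(h, v, g=G):
    """Positive root t* of the equation g*t**2/2 = h + v*t."""
    return (v + math.sqrt(v**2 + 2 * g * h)) / g

def path(t, h, v, g=G):
    """Path covered by the point up to the moment t >= 0."""
    t_star = impact_time(h, v, g)
    if t <= t_star:
        return g * t**2 / 2   # free fall before the impact
    return h + v * t          # assumed: the point moves with the platform
```

Both formulas give the same value at $t = t^*$, so the function has no jump at the moment of impact.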

The tabular representation of a function gives the numerical values 
of the function for certain discrete values of the argument. Such 
a table may have the following form:

$$\begin{array}{c|cccc} x & x_1 & x_2 & x_3 & \ldots \\ \hline y & y_1 & y_2 & y_3 & \ldots \end{array} \qquad (2)$$

Each of the differences $x_2 - x_1$, $x_3 - x_2$, ... is called a step
of the table; the tables with a constant step are the most convenient;
the argument x entering into such a table is taken for the values a,
$a + h$, $a + 2h$, .... Here the constant step is denoted by h. The
well-known tables of logarithms, of trigonometric functions etc.
are examples of tabular representation of functions. In these tables,
to save space, the values of the functions are written in lines (like
words in a book) but not in a single row [as in (2)]. There are many
other different tables of various important functions. Some tables
represent a result of an experiment in which the values of one of 
the quantities are set beforehand and the values of the other quan- 
tities are measured etc. 

The great advantage of the tabular method is that the values of
a function are already calculated and therefore they can be
immediately utilized. But we sometimes need the values of a function
represented by a table which correspond to the values of the argument 
that are not included into the table. Then we have to perform some 
additional operations, namely, the operation of interpolation (i.e. 
calculating the values of the function for intermediate values of 
the argument) or the operation of extrapolation (i.e. calculating the 
values of the function for the values of the argument that fall out- 
side the table). These additional calculations often yield incorrect 
results. Tables sometimes occupy much space. Building up a table 
usually requires much work. But in recent years the development 
of modern computer techniques has made it possible to calculate 
tables more quickly. 
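Interpolation in a table can be sketched as follows. This is only the simplest (linear) variant, given here as an illustration; the function name `interpolate` and the sample table of $y = x^2$ with step $h = 0.5$ are assumptions made for the example.

```python
def interpolate(xs, ys, x):
    """Linearly interpolate a table (xs increasing) at an intermediate x."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the table (that would be extrapolation)")
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])

xs = [0.0, 0.5, 1.0, 1.5, 2.0]       # constant step h = 0.5
ys = [x**2 for x in xs]              # tabulated values of y = x^2
approx = interpolate(xs, ys, 0.75)   # the exact value is 0.5625
```

Here the interpolated value is 0.625 while the exact one is 0.5625, which shows how these additional calculations introduce an error that depends on the step of the table.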

The disadvantage of the method is that it is inconvenient for 
performing mathematical operations because each new operation 
can make it necessary to compile a new table which is hard work. 
Besides, this cannot sometimes be done with a desired accuracy. 

The third basic method of representing functions is the graphical 
method. The method represents a function by means of constructing 
its graph which enables us to visualize the character of variations 
of the function. Besides, when we have the graph of a function we
can easily find approximate values of the function accurate to one
or two decimal places but, of course, only in a given range of the
argument. The construction of a more accurate graph requires much 
effort and yet the accuracy of the values of the function obtained 
by means of the graph may not be sufficient. It should be noted 
that some graphs characterizing an experiment may be drawn by 
a self-recording apparatus. 

The fourth method of representing functions has been widely
spread in recent years and is now becoming one of the most important
methods. This is the method of compiling a program for calculating
the values of a given function with the help of an electronic
computer. We shall discuss the method in Sec. XIX.6.

All these methods in a certain sense supplement one another. We
often come across the problem of passing from one method to another,
that is the problem of constructing a graph, of compiling a table
(the so-called tabulation) or the problem of finding a suitable
formula. Later on we shall discuss the problems of this kind.
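Passing from a formula to a table (tabulation) is easy to express as a short program. The sketch below assumes a constant step h and uses the formula $y = x^2 - 2x$ from this section; the name `tabulate` is illustrative.

```python
def tabulate(func, a, h, n):
    """Return n rows (x, func(x)) for x = a, a + h, a + 2h, ..."""
    return [(a + i * h, func(a + i * h)) for i in range(n)]

table = tabulate(lambda x: x**2 - 2*x, a=0.0, h=0.5, n=5)
for x, y in table:
    print(f"{x:6.2f} {y:8.4f}")
```

The same function is thus represented by a formula, by a program and by a table at once.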

There are, of course, many other ways of representing functions. 
We can sometimes give a verbal explanation of the law of corres- 
pondence between an independent variable and a function. For 
instance, we can say that a tax is such-and-such function of one’s 
income. 

For the first time the definition of the concept of a function close
to the modern definition was given by the Swiss mathematician
Johann Bernoulli (1667-1748) in 1718. But in the 18th century
functions were usually understood in the sense of an analytical
formula. The modern general concept of a function based on the notion
of a law of interdependence between variables was first introduced by
Euler in 1755 but it became universally recognized only in the 19th
century.

14. Graphs of Functions. Graphs serve for the geometrical
representation of functions. We shall remind the reader of the
techniques of constructing graphs which are known from
elementary mathematical courses. Let a variable y be a function
of a variable x, i.e. $y = f(x)$. In order to construct the graph of
the function we choose two number scales (axes) lying in a plane.
The x-axis is usually drawn from left to right and is called the
axis of abscissas and the y-axis is perpendicular to the x-axis and
is called the axis of ordinates. The origin from which the coordinates
are reckoned is often chosen at the intersection point of the
axes (see Fig. 7). Then we make the argument take on different
values and find the corresponding values of $y = f(x)$ which enables
us to construct the points of the graph.

The point M shown in Fig. 7 is an arbitrary “moving” (variable) 
point of the graph and it has current coordinates x, y. Practically 
we cannot construct a very large number of points. We then connect 
the points with a curve and thus receive an approximate represen- 
tation of the graph. But theoretically we interpret such a construc- 
tion as if the variable x ran through its whole range; then the moving 
point M should run along the whole graph. Fig. 7 represents an 
example of a graph. We see that in this case the value of the function
first increases as the argument x increases; such an increase lasts
until x approaches, approximately, the value $x = 0.5$; then the
function begins to decrease (comparatively slowly) and beginning
with $x = 2$ the function increases again with an increasing rate.

The units of length and the origins on each of the axes should be
chosen in such a way that all the most interesting peculiarities of
the behaviour of the function on the corresponding intervals of the
ranges of the argument and of the function should be represented
most clearly.

For example, let us consider the graph of a function which des- 
cribes a uniformly accelerated motion. Let the law of motion be 

$$s = 98 + 0.01t^2 \quad (t \ge 0) \qquad (3)$$

where t is measured in seconds and s is measured in centimetres. In 
this case we can choose the number scales on the axes in the way 
shown in Fig. 8. It is clear that a change of the position of the origin
on the axis of the argument or on the axis of the function results,
respectively, in the parallel translation of the graph as a whole
along the x-axis or along the y-axis. If we increase or decrease the
unit of length along one of the axes then the graph will, respectively,
expand or contract along the same direction in such a way that
the distances from all the points of the graph to the other axis will
increase or decrease the same number of times. For instance, Fig. 9
represents the form of the graph of the same function (3) after the
scale for the t-axis has been changed. The new graph is obtained from
the original one by expansion in the direction parallel to the t-axis.

Fig. 8    Fig. 9

In order to represent the behaviour of a function in the best way 
one sometimes uses non-uniform scales which were already mentioned 
in Sec. 4. 

In what follows we shall regard variables (both arguments and 
functions) as dimensionless unless the contrary is explicitly stated. 
In theoretical investigations the simplest thing is to take equal 
units of length for both axes and reckon the coordinates from the 
intersection point of the axes (which is then called the origin of 
the coordinate system) and we shall follow this way. We have already
described above in what way changes of scales or of the position
of the origin affect the form of a graph.

15. The Domain of Definition of a Function. The domain of definition
of a function is the totality of all the values of the independent
variable for which the function is defined, that is the admissible
range of the independent variable (see Sec. 5). We usually consider
continuous variables and in such cases, as it was pointed out in
Sec. 5, the domain of definition consists of one or several intervals.

The structure of the domain of definition of a function is sometimes
implied by a physical or geometrical meaning of the function. For
example, if we consider the relation $S = \pi R^2$ describing the
dependence of the area of a circle upon its radius the domain of
definition of the function is $0 < R < \infty$ since the geometrical
meaning of R implies these very values. In case we consider the
dependence of the atmosphere density $\rho$ at a point (lying above a
given point of the earth’s surface at the height h above sea level)
the domain of definition of the function $\rho = \rho(h)$ is the
interval $h_0 \le h \le H$ where $h_0$ is the height of the earth’s
surface and H is a conditional height which is regarded as the limit
of the atmosphere. If a function is represented by a formula then the
set of all the values of the argument for which this formula gives a
certain real value of the function is regarded as its domain of
definition (as long as we consider real functions of a real argument,
that is functions for which both the independent variable and the
dependent one assume only real values). For instance, if $y = x^3$
then x can take on any real values, i.e. the domain of definition is
the whole number line $-\infty < x < \infty$. If $y = \sqrt{x^2 - 2}$
then we cannot obtain real values of y while extracting the root in
case we have $x^2 - 2 < 0$. Consequently, there must be
$x^2 - 2 \ge 0$, i.e. $x^2 \ge 2$. The last inequality is fulfilled
for $x \le -\sqrt{2}$ or $x \ge \sqrt{2}$; the domain of definition
consists of two intervals $-\infty < x \le -\sqrt{2}$ and
$\sqrt{2} \le x < \infty$ (the domain is shaded in Fig. 10). In other
analogous cases in order to determine the domain of definition of a
function we must first find out what may prevent us from getting real
values of the function and then form inequalities (as it was done in
the last example when we put down $x^2 - 2 \ge 0$) which guarantee the
possibility of obtaining real values. Then the problem of determining
the domain of definition is reduced to solving these inequalities.
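The inequality that describes a domain of definition can serve as an explicit guard in a program. A minimal sketch for the example $y = \sqrt{x^2 - 2}$ follows; the names `in_domain` and `y` are illustrative.

```python
import math

def in_domain(x):
    """True when y = sqrt(x^2 - 2) has a real value at x."""
    return x**2 - 2 >= 0

def y(x):
    """Value of the function; refuses arguments outside the domain."""
    if not in_domain(x):
        raise ValueError("x is outside the domain of definition")
    return math.sqrt(x**2 - 2)
```

Arguments between $-\sqrt{2}$ and $\sqrt{2}$ are rejected, in agreement with the two intervals found above.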

If an independent variable is discrete the domain of definition
of the corresponding function consists of discrete (separate) points.
For instance, if $f(x) = x! = 1 \cdot 2 \cdot \ldots \cdot x$ then x
can assume only the values 1, 2, 3, .... In case a discrete argument
takes on only integer values, as in the above example, it is usually
denoted not by x but by the letters n, m, k and the like whereas the
values $f(1)$, $f(2)$, ..., $f(n)$, ... are denoted as $a_1$, $a_2$,
..., $a_n$, .... In such a case we say that there is a sequence; for
example, a geometric progression of the form

$$a_1 = a,\quad a_2 = aq,\quad \ldots,\quad a_n = aq^{n-1},\quad \ldots$$

is an example of a sequence etc. The graph of a function of a discrete
argument is not a continuous line but consists of discrete points
(see Fig. 11).
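A function of a discrete argument is naturally stored as a list of its values. The sketch below builds the first terms of the two sequences mentioned here, the factorials $a_n = n!$ and a geometric progression $a_n = aq^{n-1}$; the function names are illustrative.

```python
import math

def factorial_sequence(n_terms):
    """a_n = n! for n = 1, 2, ..., n_terms."""
    return [math.factorial(n) for n in range(1, n_terms + 1)]

def geometric_progression(a, q, n_terms):
    """a_n = a * q**(n - 1) for n = 1, 2, ..., n_terms."""
    return [a * q**(n - 1) for n in range(1, n_terms + 1)]
```

Plotting such a list against n = 1, 2, 3, ... gives separate points rather than a continuous line.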

The range of a dependent variable, that is the set of all the values
assumed by a function as its argument runs over the domain of
definition of the function, is called the range of the function. For
example, the domain of definition of the function $y = x^2$ is the
interval $-\infty < x < \infty$ and the range of this function is the
interval $0 \le y < \infty$ since in this case y assumes only
non-negative values.

Fig. 11


The determination of the domain of definition of a function is 
essential for constructing its graph because this domain is just 
a part of the axis of abscissas over or under which the graph is placed. 
We see three simple graphs depicted in Fig. 12; the domains of the
functions are shaded. It is clear that in case a domain of definition
consists of several separate parts the corresponding graph also
consists of several components.

16. Characteristics of Behaviour of Functions. Now our aim is to 
study the ways of describing characteristic features of the behaviour 
of functions. 

Unless otherwise stated, we shall regard functions in question
as single-valued, that is we shall suppose that to each value of an
independent variable taken from the domain of definition of a function
there corresponds only one certain value of the function.
Multiple-valued functions will be discussed in Sec. 20.

A function is called increasing (decreasing) if the values of the 
function increase (decrease) as the argument increases. Both in- 
creasing and decreasing functions are called monotonic. When a func- 
tion is not monotonic it is usually possible to indicate the intervals 
of monotonicity of the function on the axis of the argument; the func- 
tion is monotonic on each of such intervals. Between the intervals 
of monotonicity there are often the intervals of constancy of the 
function. For example, in Fig. 13 we see the graphs of an increasing
function $f(x)$, of a decreasing function $\varphi(x)$ and of a
non-monotonic function $\psi(x)$; the function $\psi(x)$ has the
interval of increase $-\infty < x \le a$, the interval of decrease
$a \le x \le b$, the interval of constancy $b \le x \le c$ and the
interval of increase $c \le x < \infty$.

The condition that a function $f(x)$ increases can be written in
the following way: $x_1 < x_2$ always implies $f(x_1) < f(x_2)$. This
enables us to perform similar operations on both sides of inequalities
involving the argument and the values of an increasing function:
for example, as we know that $y = x^3$ is an increasing function we
see that an inequality of the form $a < b$ implies $a^3 < b^3$ and
vice versa. In case a function $f(x)$ is not monotonic the operation of
this type can be performed on every interval of monotonicity in which
the function increases. If a function decreases on some interval then
$x_1 < x_2$ implies $f(x_1) > f(x_2)$. For instance, the function
$y = x^2$ decreases over the interval $-\infty < x \le 0$ and increases
for $0 \le x < \infty$; hence, $a < b$ implies $a^2 > b^2$ in case
$b \le 0$ and $a^2 < b^2$ if $a \ge 0$.
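The defining inequality of an increasing function can be tested on a finite set of sample points. The sketch below is an illustration only: passing the test on samples does not prove monotonicity, though a failure does disprove it. The name `is_increasing_on` and the sample points are assumptions made for the example.

```python
def is_increasing_on(func, points):
    """Check f(x1) < f(x2) for consecutive sample points x1 < x2."""
    values = [func(x) for x in sorted(points)]
    return all(u < v for u, v in zip(values, values[1:]))

def cube(x):
    return x**3

def square(x):
    return x**2

samples = [-3, -2, -1, 0, 1, 2, 3]
```

With these samples the cube passes on the whole set, while the square passes only when the samples are restricted to $x \ge 0$, in agreement with its intervals of monotonicity.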

A function is called continuous if a continuous (“gradual”) change
of the argument results in a continuous change of the values of the
function (without “jumps”). Otherwise the function is called
discontinuous and every value of the argument for which the continuity
(“gradualness”) of the change of the function does not take place
is called the point of discontinuity of the function. (These notions
will be discussed in detail in § III.4.) For example (see Fig. 14),
the function $y = x^2$ is continuous over the whole x-axis; the
function $y = \frac{1}{x}$ has one point of discontinuity $x = 0$ (the
values of the function “approach infinity” as the values of the
argument approach the value $x = 0$) and is continuous for all other
values of $x \ne 0$; the function $y = \tan x$ has an infinitude of
points of discontinuity
$x = \pm\frac{\pi}{2},\ \pm\frac{3\pi}{2},\ \pm\frac{5\pi}{2},\ \ldots$.

Fig. 14

If a function is defined on both sides of its point of discontinuity 
the graph of the function is also discontinuous and consists of two 
or more pieces (components; see, for example, Fig. 14). 



Fig. 15 

The function $y = \sin x$ is an example of a periodic function.
Namely (see Fig. 15), the behaviour of the function on the intervals

$$\ldots,\ -4\pi \le x \le -2\pi,\ -2\pi \le x \le 0,\ 0 \le x \le 2\pi,\ 2\pi \le x \le 4\pi,\ \ldots$$

is similar. More precisely, $\sin(x + 2\pi) \equiv \sin x$. Here the
sign $\equiv$ is the sign of identity. We write it when we want to
underline that an equality is an identity (it is also permissible to
put down the usual sign of equality in such a case).

The number $2\pi$ is called the period of the function $y = \sin x$.
We also have the identities

$$\sin(x + 4\pi) \equiv \sin x,\quad \sin(x + 6\pi) \equiv \sin x,\quad \sin[x + (-2\pi)] \equiv \sin x \ \text{etc.}$$

But the period is usually understood as a positive number and even
the least number for which the identities hold. Therefore it is $2\pi$
that is the period of $y = \sin x$ but not $4\pi$, $6\pi$ etc. By the
way, in order to indicate the fact we sometimes speak about the
primitive period.

Generally, a function $y = f(x)$ is called a periodic function with
period $A > 0$ if there is an identity of the form
$f(x + A) \equiv f(x)$. Such a function behaves in the same way on each
of the intervals

$$\ldots,\ a - 2A \le x \le a - A,\ a - A \le x \le a,\ a \le x \le a + A,\ a + A \le x \le a + 2A,\ \ldots$$

where a is an arbitrary number. Therefore in order to investigate
the function it is sufficient to consider its behaviour on one of the
intervals (see Fig. 16). The equality $f(x + A) = f(x)$ is illustrated
in Fig. 16 for one of the values of x.

A function $y = f(x)$ is called an even function if it does not change
its value when the sign of the argument is changed, that is if
$f(-x) \equiv f(x)$. The examples of even functions are $y = x^2$,
$y = x^6$, $y = \cos x$ etc. Fig. 17 shows that the graph of an even
function is symmetric with respect to the axis of ordinates. A
function $f(x)$ is called odd in case it is multiplied by $-1$ when
the sign of the argument is changed, that is $f(-x) = -f(x)$. The
examples of odd functions are $y = x$, $y = x^5$, $y = \sin x$ etc.
Fig. 18 illustrates the fact that the graph of an odd function is
symmetric with respect to the origin of the coordinate system. It
should be noted that in the general case a function may be neither
even nor odd; for example, this is the case with the functions
$y = 1 + \sin x$, $y = 1 - x$, $y = 2^x$, $y = \log x$ etc.
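The identities that define even, odd and periodic functions can be tried out on sample points. The sketch below is a numerical illustration and not a proof; the function names, the tolerance and the sample points are assumptions made for the example.

```python
import math

def is_even(func, points, tol=1e-9):
    """Test f(-x) = f(x) at the sample points."""
    return all(abs(func(-x) - func(x)) <= tol for x in points)

def is_odd(func, points, tol=1e-9):
    """Test f(-x) = -f(x) at the sample points."""
    return all(abs(func(-x) + func(x)) <= tol for x in points)

def has_period(func, period, points, tol=1e-9):
    """Test f(x + period) = f(x) at the sample points."""
    return all(abs(func(x + period) - func(x)) <= tol for x in points)

points = [0.1 * k for k in range(1, 40)]
```

With these points the cosine passes the test for evenness, the sine passes the test for oddness and for the period $2\pi$, and $y = 1 + \sin x$ fails both parity tests.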

17. Algebraic Classification of Functions. Functions represented
by a single formula (see Sec. 13) are classified depending on the
necessary algebraic operations which should be performed on the
values of the argument in order to obtain the values of the functions.
If only the operations of addition, subtraction and multiplication
are used, and also the operation of raising to a positive integral
power which is a special case of multiplication, the function is
called a polynomial or an entire (integral) rational function; in
forming a polynomial arbitrary constant coefficients can be used.
Examples of polynomials are $y = x^3 - 2x + 3$, $y = x^2$, $y = 3$,
$y = -\frac{x}{2} + \sqrt{2}\,x^3$, $y = ax^2 - 2$ and the like. On the
other hand, the functions $y = x^{-2}$ and $y = x^3 + 2^x$ are not
polynomials in the sense of the above definition. Every polynomial is
characterized by its degree which is the highest of the exponents of
powers of the independent variable entering into the expression of a
polynomial; for instance, the degrees of the above written polynomials
are, respectively, 3, 2, 0, 3, and 2.

Rational functions form a wider class of functions: these are the 
functions which involve the additional operation of division. If 
a rational function is not an entire function it is called a fractional 
rational function. An example of such a function is 

$$y = \frac{x^2 - 1}{x\sqrt{2} - 3} - \frac{ax - b^3}{x^2 + 1}$$

According to the rules of elementary algebra every rational function
can be represented as a ratio of two polynomials after all the summands 
entering into the expression of the function are reduced to a common 
denominator. 

There is a still wider class of functions whose analytical expres- 
sions may involve an additional operation of extracting roots. This 
is the class of algebraic functions. If an algebraic function is not 
rational it is called irrational. An example of an irrational function
is $y = x^2 - \frac{1}{x} + \sqrt{x^2 - 1}$.

Functions which are not algebraic are called transcendental.
Examples of transcendental functions are $y = \sin x$,
$y = x^2 + \tan x$, $y = 2^x$, $y = \log x$ etc. We point out that the
last two functions are transcendental despite the fact that they are
sometimes traditionally considered in elementary courses on algebra.

All these definitions are automatically extended to functions of 
several independent variables. The only new fact is the definition 
of the degree of a polynomial in several variables: it is defined as 
the greatest of the sums of the exponents of arguments entering 
into the monomials which are the summands in the expression of 
the polynomial. 

For instance, the function $f(x, y) = x^4y - x^2y^4 + x$ is a
polynomial of the sixth degree in x and y. But if we regard y as fixed
the same function will be a polynomial of the fourth degree in x.

A polynomial of the first degree and a polynomial of the second 
degree are called, respectively, a linear function and a quadratic 
function. A polynomial of the third degree is called a cubic function 
and so on. These terms are applied for any number of independent 
variables. 
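The definition of the degree of a polynomial in several variables can be sketched in a few lines. The representation of a polynomial as a dictionary mapping exponent pairs to coefficients is an assumption made for the example, and the polynomial $x^4y - x^2y^4 + x$ is chosen for illustration because it has the degrees quoted above (six in x and y together, four in x alone).

```python
def total_degree(poly):
    """Degree of a polynomial given as {(exp_x, exp_y): coefficient}."""
    return max(ex + ey for (ex, ey), c in poly.items() if c != 0)

def degree_in_x(poly):
    """Degree in x when y is regarded as fixed."""
    return max(ex for (ex, ey), c in poly.items() if c != 0)

# Illustrative polynomial (an assumption): x^4*y - x^2*y^4 + x
p = {(4, 1): 1, (2, 4): -1, (1, 0): 1}
```

The monomial $x^2y^4$ decides the total degree, whereas $x^4y$ decides the degree in x.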

18. Elementary Functions. We first enumerate the basic elemen- 
tary functions studied in elementary mathematical courses: 

$y = x^a$ (where a is constant) is a power function;

$y = a^x$ (where a is constant) is an exponential function;

$y = \log_a x$ (where a is constant) is a logarithmic function;

$y = \sin x$, $y = \cos x$, $y = \tan x$ and $y = \cot x$ are
trigonometric functions (circular functions);

$y = \arcsin x$, $y = \arccos x$ etc. are inverse trigonometric
functions (inverse circular functions).

Elementary functions are all the functions which can be obtained 
from basic elementary functions by means of algebraic operations 
(with any numerical coefficients) and the operation of composing 
a function of a function (see Sec. 12). In view of this definition all 
the algebraic functions are elementary. But very many transcendental 
functions are also elementary, for example, the functions
$y = x + \log \sin x$, $y = 2^{\log \tan x + \sin x}$ etc. (one may
come across very complicated expressions of this type).

The class of elementary functions includes the greater part of 
functions treated in general courses of higher mathematics. As an 
example of a function which is not elementary we can mention
$y = x!$ (we do not give now the general definition of this function
for non-integral values of the argument; on this question
see Sec. XIV.17). Many non-elementary functions are widely used
in special branches of mathematics and its applications. Many of 
these functions have been investigated in detail and therefore the 
traditional classification into elementary and non-elementary func- 
tions may now be considered out of date. 

19. Transforming Graphs. It often happens that we know the
graph of a function and it is necessary to construct the graphs of 
some other functions which can be expressed in a certain way in 
terms of the former graph. Here we give several examples of trans- 
forming graphs in this manner. 

Let the graph of a function $y = f(x)$ be given. It is required to
construct the graphs of the functions $z = f(x) + a$ and
$u = f(x + b)$ where a and b are some constants. The values of the
quantities z and u will be represented on the same axis of ordinates
as that for y (see Fig. 19). Then we have $z = y + a$ for any x and
therefore the graph of the function $z(x)$ can be obtained from the
graph of the function $y(x)$ by translating the latter along the
y-axis by the distance a in the positive direction of the axis in case
$a > 0$. Such a translation is depicted in Fig. 19 where each of the
vertical line segments has the length a. As for the graph of the
function $u(x)$ one may think that it is obtained from the graph of
$y(x)$ by translating the latter along the x-axis by the distance b in
the positive direction in case $b > 0$. But this conclusion is wrong;
in fact we should displace the graph of $y(x)$ by the amount b in the
negative direction (if $b > 0$) in order to obtain the graph of
$u(x)$. Indeed, the value $u = y$ is obtained if we take the value of
the argument for u which is smaller by b than the corresponding value
of the argument for y since $u = f[(x - b) + b] = f(x) = y$. Of
course, if $a < 0$ or $b < 0$ the corresponding translation should be
carried out in the opposite direction. On the other hand, when we say,
for example, “to displace upwards by (−3)” we mean, in fact, “to
displace downwards by (+3)” etc. Therefore it is permissible to say
that we translate a graph in a certain direction by an amount h no
matter what the sign of h is.

Fig. 19    Fig. 20
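The direction of the horizontal translation is easy to confirm on a concrete function. The sketch below takes $f(x) = x^2$ as a sample graph with its lowest point at $x = 0$; the helper `shifted` and the chosen constants are illustrative assumptions.

```python
def f(x):
    return x**2          # sample graph with its lowest point at x = 0

def shifted(func, a=0.0, b=0.0):
    """Return the function g(x) = func(x + b) + a."""
    return lambda x: func(x + b) + a

u = shifted(f, b=2.0)    # u(x) = f(x + 2)
z = shifted(f, a=3.0)    # z(x) = f(x) + 3
```

The value that f takes at $x = 0$ is taken by u at $x = -2$, so for $b = 2 > 0$ the graph has moved to the left, while z simply lies 3 units higher than f at every x.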

The graphs of the functions $v = kf(x)$ and $w = f(kx)$ are
constructed in a similar way (see Fig. 20). The graph of the function
$v(x)$ is obtained from the graph of $y(x)$ by the uniform k-fold
expansion of the latter in the direction of the y-axis so that all the
distances from the points of the graph of $y(x)$ to the x-axis should
increase k times (in case $k > 1$). Indeed, the points of the graph of
$v(x)$ which have the same abscissas as the points of the graph of
$y(x)$ have the ordinates k times the ordinates of $y(x)$. The graph
of the function $w(x)$ is obtained from the graph of the function
$y(x)$ by the uniform contraction of the latter toward the y-axis with
the k-fold decrease (in case $k > 1$) of all the distances of the
points of the graph of $y(x)$ to the y-axis. Actually, we have
$w\left(\frac{x}{k}\right) = f\left(k \cdot \frac{x}{k}\right) = f(x) = y(x)$.
Of course, what we have said is literally true if $k > 1$. In case
$0 < k < 1$ the expansion is replaced by the contraction and vice
versa. But again, when we say, for example, “the $\frac{1}{3}$-fold
expansion” we actually mean “the 3-fold contraction” and the like.
Therefore we can say that we perform a k-fold expansion (or
contraction) without specifying the magnitude of k. In conclusion we
remark that if $k < 0$ we should additionally apply the operation of
forming the corresponding mirror images of the graphs of $v(x)$
and $w(x)$ (for the function $v(x)$ the mirror image must be taken
about the x-axis and for the function $w(x)$ the mirror image must
be taken with respect to the y-axis).

Now combining the above results we can say that the graph of
the function y = kf(mx + b) + a can be obtained from the graph
of the function y = f(x) by means of the following transformations
(performed in succession): the parallel translation along the x-axis
[which yields the graph of y = f(x + b)], the contraction [which
results in the graph of y = f(mx + b)], the expansion [which gives
the graph of y = kf(mx + b)] and one more final translation along
the y-axis resulting in the desired graph of y = kf(mx + b) + a.
(If necessary, the corresponding mirror images should also be taken.)
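
The succession of transformations just described can be checked
numerically. The following is only a sketch in Python (the language,
the helper names transform_point and direct_value and the constants
are our own illustrative assumptions, not part of the course): each
point (x, y) of the graph of y = f(x) is carried through the four
steps and the result is compared with a direct evaluation of
kf(mx + b) + a.

```python
import math


def transform_point(x, y, k, m, b, a):
    """Carry the point (x, y) of the graph of y = f(x) over to the
    corresponding point of the graph of y = k*f(m*x + b) + a.

    Assumes m != 0. Negative k or m produce the mirror images
    mentioned in the text automatically.
    """
    x = x - b      # translation along the x-axis: graph of f(x + b)
    x = x / m      # contraction toward the y-axis: graph of f(m*x + b)
    y = k * y      # expansion from the x-axis: graph of k*f(m*x + b)
    y = y + a      # translation along the y-axis: final graph
    return x, y


def direct_value(f, x, k, m, b, a):
    """Evaluate k*f(m*x + b) + a directly (used only as a check)."""
    return k * f(m * x + b) + a


if __name__ == "__main__":
    k, m, b, a = 2.0, 3.0, 1.0, -0.5        # illustrative constants
    for x in (-1.0, 0.0, 0.7, 2.5):
        x_new, y_new = transform_point(x, math.sin(x), k, m, b, a)
        # The two printed ordinates should agree up to rounding.
        print(x_new, y_new, direct_value(math.sin, x_new, k, m, b, a))
```

Note that the abscissa is shifted by -b, in agreement with the remark
above that the graph of f(x + b) is displaced in the negative
direction for b > 0.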

The same results can be obtained by the corresponding operations 
on the coordinate axes without changing the graph. For example, 
instead of displacing the graph to the right we can translate the 
axes to the left, or, in other words, displace the origin (from which 
x is reckoned) to the left. Similarly, instead of expanding the graph 
and increasing the distances from the x-axis k times we can decrease 
the corresponding unit of length for the y-axis k times. 

We can perform arithmetical operations on functions represented
graphically. For example, Fig. 21 illustrates the graphical addition
of two functions: the graphs of f(x) and φ(x) are given and it is
necessary to construct the graph of their sum. The equal line
segments lying one above another are shown in heavy lines.

of the axes x and y interchanged (see Fig. 23). Each of the
single-valued branches is represented by the corresponding half of the
parabola: the first branch is represented by the upper half and the
second one by the lower half.

The graph of an implicit function may have the form shown in
Fig. 24. Here we see that the function is single-valued for x < a
and for x > b whereas it is three-valued for a < x < b. When
separating the function into branches it is natural to regard the
arc AB as the graph of the first branch, the arc BC as the graph of
the second branch and the arc CD as that of the third one.

In connection with the question of multiple-valued functions
discussed above we can note that for some functions to every value
of the independent variable there corresponds an entire interval
of values of the function. For instance, the relation between the
height of a person and his possible weight is an example of such
a functional relation. Functions of this kind are usually investigated
in the theory of probabilities (see Sec. XVIII.16) and they will not
occur in other chapters of our course.

21. Inverse Functions. Suppose we are given a function

y = f(x)    (6)

Let us now take different values of y and find the corresponding
values of x, that is let us choose the former dependent variable as
the independent one and regard the former independent variable as
a function. The functional relation x(y) thus obtained is called
the inverse function of the original function y(x). It is represented
by the same relation (6) in which y is now regarded as an independent
variable and x is regarded as a dependent variable. But we have
already pointed out that it is permissible to use different notation
for the variables in considering one and the same function.
Therefore, if we wanted to denote, as we usually did, the
independent variable by x and the dependent variable by y for the
inverse function we should simply have to substitute x for y and
y for x in (6). Hence, using the new notation we rewrite the relation
which defines the inverse function in the form

x = f(y)    (7)

Thus, the inverse function turns out to be represented in an implicit
form and therefore (see Sec. 20) it is, generally speaking,
multiple-valued. We can easily establish a condition which guarantees
the single-valuedness of an inverse function: this is the monotonicity
of the original function. Indeed, this being so, we obtain a certain
uniquely defined value x = x(y) for each given value of y (see
Fig. 25).

Examples. The inverse function of the function $y = x^3$ is defined
by the equality $x = y^3$, that is $y = \sqrt[3]{x}$; the inverse
function of $y = x^2$ is the two-valued function $y = \pm\sqrt{x}$.

Equalities (6) and (7) differ only in the interchange of the notation
of the quantities x and y, that is in the interchange of their roles.
Therefore we see, as it is shown in Fig. 26, that the graph of the
inverse function is obtained as a mirror image of the graph of the
original function about the bisector of the angle between the
coordinate axes (the bisector is represented by the dotted line in
Fig. 26). The points M and M' in Fig. 26 both correspond to one and
the same equality of the form b = f(a).

We remark in conclusion that if the function x(y) is the inverse
function of the function y(x) then, conversely, the latter is the
inverse function of the former.
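
For a monotonic function the values of the inverse function can also
be found numerically. The sketch below is written in Python under our
own assumptions (the names inverse_monotonic and mirror_about_bisector
are illustrative, and bisection is merely one convenient method which
the text itself does not prescribe). It solves f(x) = y on an interval
where f is continuous and strictly monotonic, and it also forms the
mirror image of a set of points about the bisector by interchanging
their coordinates.

```python
def inverse_monotonic(f, y, lo, hi, tol=1e-12):
    """Solve f(x) = y for x on [lo, hi] by bisection.

    Assumes f is continuous and strictly monotonic on [lo, hi] and
    that y lies between f(lo) and f(hi), so the root is unique.
    """
    increasing = f(hi) > f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # Keep the half of the interval that still contains the root.
        if (f(mid) < y) == increasing:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def mirror_about_bisector(points):
    """Reflect points (x, y) about the line y = x by swapping coordinates."""
    return [(y, x) for (x, y) in points]


if __name__ == "__main__":
    cube = lambda x: x ** 3
    # Inverse of y = x**3 at y = 8, searched on the interval [0, 3].
    print(inverse_monotonic(cube, 8.0, 0.0, 3.0))
    # Points of the graph of the inverse function from points of the cube.
    print(mirror_about_bisector([(x, cube(x)) for x in (0.0, 1.0, 2.0)]))
```

For a non-monotonic function such as $y = x^2$ the same routine can be
applied to each monotonic portion separately, which yields the
separate branches of the inverse function.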
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§ 4. Review of Basic Functions 

Many of the functions which we are going to discuss here are
studied in elementary mathematical courses. We shall consider them
here because of their significance.

22. Linear Function. The general form of a linear function (see 
the end of Sec. 17) is 

y = ax + b (8) 

where a and b are constant coefficients. 

The graph of a linear function is a straight line (see Fig. 27). The
coefficient a is called the slope of the straight line. The greater |a|
(i.e. the greater a in its absolute value), the steeper the slope of the
straight line (with regard to the x-axis). If the argument of a
function changes from a value $x_0$ to a certain value x it receives
an increment $\Delta x$ (which is equal to $x - x_0$)*. Then the
function receives the corresponding increment $\Delta y$ [which is
equal to $y - y_0 = f(x) - f(x_0)$]. In our case y = f(x) = ax + b
and therefore $y_0 = ax_0 + b$ and y = ax + b. Consequently,
$y - y_0 = a(x - x_0)$, i.e. $\Delta y = a\,\Delta x$. This implies

$$\frac{\Delta y}{\Delta x} = a \quad (\text{if } \Delta x \neq 0) \qquad (9)$$

Thus, the ratio of the increment of a linear function to the
increment of the argument is constant and equal to the slope of the
graph. The increment of a linear function is directly proportional
to the increment of the argument.

Fig. 27

In Fig. 27 the case when a > 0 is shown. If a < 0 the straight
line is drawn downwards to the right (see Fig. 28). In case a = 0
the straight line is parallel to the x-axis; in this case the function
is constant and thus we obtain the graph of a constant.

* The Greek letter Δ (delta) is used to denote an increment. The
symbol Δx should be regarded as an indivisible symbol and by no means
as the product of Δ by x. An “increment” is understood in the
algebraic sense, i.e. it can be positive, negative or equal to zero.

The property of the increment of a linear function forms the basis
for the so-called linear interpolation which is used even in
elementary mathematical courses. The idea of this method is the
following. Suppose we know the values of a function y = f(x) (its
graph is depicted in Fig. 29 by the dotted line) for $x = x_0$ and
for $x = x_0 + h$:

$$f(x_0) = y_0, \qquad f(x_0 + h) = y_1$$

But the intermediate values of the function for the values of x lying
between $x = x_0$ and $x = x_0 + h$ are unknown. Then we
approximately replace the given function by a linear function which
assumes the same values for $x = x_0$ and for $x = x_0 + h$, that is
we replace the arc AE by the straight line segment AE. The
similarity of the triangles ABC and ADE then implies

$$\frac{y - y_0}{x - x_0} = \frac{y_1 - y_0}{h}$$

i.e.

$$y = y_0 + \frac{y_1 - y_0}{h}\,(x - x_0)$$

Such a replacement is possible in case the function f(x) slightly
differs from the linear function on the interval between $x_0$ and
$x_0 + h$. The interpolation method is widely used, in particular,
for tables with a sufficiently small step when the successive values
of the function differ slightly from each other. More precise methods
of interpolation will be discussed in Secs. V.6-8. The linear
extrapolation (see Sec. 13) is performed in like manner.
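
The interpolation formula is easily put into computational form. The
following Python sketch is only an illustration under our own
assumptions (the function name linear_interpolate and the choice of
sin x as the tabulated function are hypothetical): it evaluates
$y = y_0 + \frac{y_1 - y_0}{h}(x - x_0)$ and compares the result with
the true value of the function.

```python
import math


def linear_interpolate(x0, y0, h, y1, x):
    """Value at x of the linear function passing through
    (x0, y0) and (x0 + h, y1); h must be nonzero."""
    return y0 + (y1 - y0) / h * (x - x0)


if __name__ == "__main__":
    # Two neighbouring entries of an imagined table of sin x with step 0.1.
    x0, h = 0.5, 0.1
    y0, y1 = math.sin(x0), math.sin(x0 + h)
    x = 0.53
    approx = linear_interpolate(x0, y0, h, y1, x)
    # The difference shows how far sin x departs from a straight line
    # on this short interval.
    print(approx, math.sin(x), approx - math.sin(x))
```

The same function performs linear extrapolation if x is taken outside
the interval between $x_0$ and $x_0 + h$, although the error is then
usually larger.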

Formula (9) and Fig. 27 imply that a = tan φ, i.e. the slope of
a straight line is equal to the tangent of the angle of inclination of
the line to the axis of abscissas.

If the quantities x and y have certain dimensions the slope also
has a dimension. Formula (8) shows that [b] = [y] and [ax] = [y]
which implies $[a] = \frac{[y]}{[x]}$ (the dimensions of coefficients
entering into other formulas can be determined similarly). The
geometrical meaning of the slope can be easily interpreted in the
general case: if $l_x$ units of length for the x-axis correspond to
the unit measure of the quantity x and $l_y$ units of length for the
y-axis correspond to the unit measure of the quantity y ($l_x$ and
$l_y$ are the so-called scale factors) then the sides AB and BC of
the triangle ABC in Fig. 27 are of lengths $l_x \Delta x$ and
$l_y \Delta y$, respectively. Consequently,

$$\tan\varphi = \frac{l_y\,\Delta y}{l_x\,\Delta x} \quad \text{and} \quad a = \frac{\Delta y}{\Delta x} = \frac{l_x}{l_y}\tan\varphi \qquad (10)$$

i.e. the slope is proportional to the tangent of the angle φ.

23. Quadratic Function. The general form of a quadratic function is

$$y = ax^2 + bx + c$$

From elementary mathematical courses it is known that the graph
of a quadratic function is a parabola. In the simplest case when a = 1,
b = 0 and c = 0, i.e. $y = x^2$, the graph has the form depicted in
Fig. 30. Then the function is even and the y-axis is the symmetry
axis of the graph (the axis of the parabola). The intersection point
of a parabola with its axis is called the vertex of the parabola. The
vertex in Fig. 30 is placed at the origin of the coordinate system.

In the general case when a ≠ 0, b and c are arbitrary numbers
the parabola is obtained from the parabola depicted in Fig. 30 by
the operations of uniform expansion and parallel translation. To
determine the position of the vertex we can apply the so-called
method of completing a square which we shall demonstrate here by
considering a concrete numerical example. Let the quadratic function
$y = 2x^2 - 3x + 1$ be given*. Then we perform the following
simple transformations:

$$y = 2\left(x^2 - \frac{3}{2}x + \frac{1}{2}\right) = 2\left[x^2 - 2\cdot\frac{3}{4}x + \left(\frac{3}{4}\right)^2 - \left(\frac{3}{4}\right)^2 + \frac{1}{2}\right] = 2\left(x - \frac{3}{4}\right)^2 - \frac{1}{8}$$

Consequently (see Sec. 19) we obtain the sought-for graph from
the parabola depicted in Fig. 30 by translating the parabola
$\frac{3}{4}$ unit of length to the right, expanding it along the
direction of the y-axis with the two-fold increase of the distances
from the points of the graph to the x-axis and, finally, by
translating it $\frac{1}{8}$ unit downwards. The graph thus obtained
is shown in Fig. 31. To construct a more accurate graph we can
additionally take several values of x and determine the corresponding
values of y which enables us to construct the corresponding points
of the graph. For instance, we have y = 1 for x = 0, y = 0 for x = 1
and y = 3 for x = 2; the corresponding points are indicated on the
graph. The vertex of the constructed parabola is situated at the
point M with the coordinates $x = \frac{3}{4}$ and
$y = -\frac{1}{8}$. The parabola is “narrower” than the one depicted
in Fig. 30 (with the same unit of length).

* In practice we usually have quadratic functions (trinomials) whose
coefficients are approximate numbers (in contrast to the above
trinomial with exact coefficients). For instance, we can take the
trinomial $y = 2.17x^2 - 3.21x + 0.84$ and the like. But if we
investigate the case with exact coefficients we can easily pass to
a more complicated case. The comment also refers to further examples
of this type.

Fig. 32    Fig. 33

Generally, the greater |a|, the narrower the parabola. If a < 0
the branches of the parabola are open downwards (see Fig. 32). In
case a = 0 the quadratic function turns into a linear one.
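
The method of completing a square can be stated once and for all:
$ax^2 + bx + c = a(x - p)^2 + q$ with $p = -\frac{b}{2a}$ and
$q = c - \frac{b^2}{4a}$, so that the vertex is the point (p, q).
The Python sketch below is only an illustration (the function name
complete_square is our own assumption); it uses exact fractions so
that the example with the coefficients 2, -3, 1 can be compared
directly with the vertex found above.

```python
from fractions import Fraction


def complete_square(a, b, c):
    """Return (a, p, q) such that a*x**2 + b*x + c == a*(x - p)**2 + q.

    The vertex of the parabola is the point (p, q).
    """
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic function")
    p = -b / (2 * a)              # abscissa of the vertex
    q = c - b * b / (4 * a)       # ordinate of the vertex
    return a, p, q


if __name__ == "__main__":
    a, p, q = complete_square(2, -3, 1)
    print(p, q)   # 3/4 -1/8, the vertex M found in the text
    # Check the identity at a few points, including those marked on the graph.
    for x in (0, 1, 2):
        assert 2 * x * x - 3 * x + 1 == a * (x - p) ** 2 + q
```

For approximate coefficients, such as those mentioned in the footnote,
ordinary floating-point numbers can be passed instead; the Fraction
type is used here only to keep the arithmetic exact.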

24. Power Function. The general form of a power function is

$$y = x^n$$

If 0 < x < 1 then the greater n, the smaller the values of the
function. But if x > 1 then the greater n, the greater the values of
the function. In Fig. 33 we see the graphs for n = 1, 2, 3 and 4. While
constructing the parts of the graphs of $y = x^n$ for the values x < 0
one should take into account that the function $y = x^n$ is even for
even n and odd for odd n. In particular, let us consider in detail
the graph of the function $y = x^3$ (the cubic parabola). The graph
is convex upwards (or concave downwards) for x < 0, that is it lies
under the tangent drawn at any of its points. For x > 0 the graph
is convex downwards (or concave upwards). If we pass from left to
right through the origin of the coordinate system the direction of
convexity changes to the opposite one. The tangent to the graph
at the origin coincides with the x-axis but at the point of tangency
O the curve passes from one side of the tangent line to another. Such
points are called the points of inflection of a curve. Thus, the cubic
parabola has one point of inflection. Among the well-known curves,
the sinusoid, for example, has points of inflection. For fractional
values of the exponent n the graphs are placed between the
corresponding graphs for integral values of n. But in the case of
a fractional n one should be careful when constructing graphs:
a negative number raised to a fractional power may result in an
imaginary number and therefore in such a case we must not construct
the graph for x < 0.

Let us consider the case 0 < n < 1. For instance, let
$n = \frac{1}{2}$, that is $y = x^{1/2} = \sqrt{x}$. Then, as it was
shown in Sec. 20, the graph is the upper half of the ordinary
(quadratic) parabola with the symmetry axis coinciding with the
x-axis (see Fig. 34). The graphs of power functions for some other
fractional n are also shown in Fig. 34. In case the fraction
representing n has an odd denominator the graph exists not only for
x > 0 but for x < 0 as well because, for negative numbers, we can
extract real roots with odd indices of radicals. In particular, let
us take the graph of the function $y = x^{2/3}$ (the semicubical
parabola) depicted in Fig. 35. The graph first approaches the origin
of the coordinates (for example, when we pass from left to right)
and then departs from the origin. At the origin this curve has the
so-called spinode, or cusp. Later on we shall investigate some other
curves having cusps.

Fig. 35

Finally, let us take the case of a negative n (n = -m < 0).
Then $y = \frac{1}{x^m}$ and therefore we have very large values
of |y| for very small |x| and vice versa. The corresponding graphs
are shown in Fig. 36 for x > 0; we leave to the reader the
construction of the parts of the graphs corresponding to x < 0. All
these graphs stretch along the coordinate axes and approach them
unlimitedly as x or y approaches infinity. Generally, when a curve
and a straight line have a mutual disposition of such a kind the
straight line is called the asymptote of the curve. Hence, each of
the above graphs has two asymptotes which are the coordinate axes.

Fig. 36    Fig. 37

One must not think, however, that a curve can never intersect its
asymptote. For example, investigating the so-called damped
oscillations we obtain a graph of the form shown in Fig. 37. Here
the x-axis is the asymptote of the graph which intersects it
infinitely many times.

25. Linear-Fractional Function. A linear-fractional function has
the general form

$$y = \frac{ax + b}{cx + d} \qquad (11)$$

In the simplest case when a = d = 0 we obtain, denoting
$\frac{b}{c} = k$, the expression $y = \frac{k}{x}$ which describes
the inverse proportional relation. The corresponding graph, as is
well known from elementary mathematical courses, is called
a hyperbola. The graph is depicted in Fig. 38 for the two cases k > 0
and k < 0 separately. The function $y = \frac{k}{x}$ being odd, the
hyperbola has a centre of symmetry (which is located at the origin
of coordinates in Fig. 38). It has two asymptotes (which are the
coordinate axes in Fig. 38). We shall prove in Sec. II.13 that
a hyperbola has two symmetry axes (for the hyperbola in question the
axes of symmetry are the bisectors of the angles between the
coordinate axes in Fig. 38).

In the general case the graph of a linear-fractional function is 
also a hyperbola which can be obtained by a parallel translation 
of the hyperbola depicted in Fig. 38. We shall illustrate this by 

taking a concrete numerical example. Let y — . We now 
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carry out the following simple transformations: 



Thus we conclude that the desired graph can be obtained from 

19 

IT 5 

the graph of the function y — — by translating the latter — units 



2 

of length to the right and units upwards. Hence, we get a hyper- 

bola whose centre of symmetry is at the point x — — , y = — (see 
Fig- 39). . 2 3 

A linear-fractional function of general form (11) has a point of
discontinuity at $x = -\frac{d}{c}$ because the denominator vanishes
at the point. This accounts for the fact that the graph consists of
two separate portions (see Sec. 16).
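
The reduction of form (11) to a translated hyperbola can be written
in general: for c ≠ 0 we have
$\frac{ax + b}{cx + d} = \frac{a}{c} + \frac{k}{x + d/c}$ with
$k = \frac{bc - ad}{c^2}$, so that the centre of symmetry is at
$x = -\frac{d}{c}$, $y = \frac{a}{c}$. The Python sketch below is an
illustration only; the function name hyperbola_parameters and the
coefficients in the usage lines are our own hypothetical choices and
are not the numerical example of the text.

```python
from fractions import Fraction


def hyperbola_parameters(a, b, c, d):
    """For y = (a*x + b)/(c*x + d) with c != 0 return (x0, y0, k)
    such that y = y0 + k/(x - x0).

    (x0, y0) is the centre of symmetry; k = 0 means that the
    function degenerates into a constant.
    """
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    if c == 0:
        raise ValueError("c = 0: the function is linear, not linear-fractional")
    x0 = -d / c                       # point of discontinuity
    y0 = a / c                        # horizontal asymptote
    k = (b * c - a * d) / (c * c)     # coefficient of the hyperbola y = k/x
    return x0, y0, k


if __name__ == "__main__":
    # Hypothetical coefficients: y = (2x + 1)/(x - 3).
    x0, y0, k = hyperbola_parameters(2, 1, 1, -3)
    print(x0, y0, k)   # 3 2 7, i.e. y = 2 + 7/(x - 3)
    for x in (0, 1, 5):
        assert Fraction(2 * x + 1, x - 3) == y0 + k / (x - x0)
```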
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26. Logarithmic Function. A logarithmic function is a function 
of the form

$$y = \log_a x \qquad (12)$$

It is defined only for x > 0, and we consider the bases of logarithms
a > 0 (a ≠ 1). The graphs of logarithmic functions are shown for
different bases in Fig. 40. They have neither symmetry axes nor
centres of symmetry but have an asymptote which is the y-axis.
All the logarithmic functions are proportional to each other since
taking logarithms to the base b of both sides of the equality
$a^{\log_a x} = x$ we get

$$\log_b x = \log_a x \cdot \log_b a = k \log_a x \qquad \left(k = \log_b a = \frac{1}{\log_a b}\right) \qquad (13)$$

Therefore we can obtain all the graphs depicted in Fig. 40 by
expanding or contracting one of them along the direction of the
y-axis with a uniform increase or, respectively, with a uniform
decrease of the distances of the points of the graph from the x-axis.
Now let us consider the angles of intersection of the graphs with the
x-axis. According to the general definition of the angle between
intersecting curves as the angle between the tangents to the curves
at the point of their intersection, we mean here the angles formed by
the tangents to the graphs with the x-axis. When the graph is
expanded or contracted in the way described above the tangent rotates
about the point of intersection. We see that the tangent has a very
gentle inclination (to the x-axis) for very large values of a and
a very steep inclination for values of a close to 1. For a certain
value of a the angle of intersection of the graph of logarithmic
function (12) with the x-axis is equal to 45°. This value of a is
denoted by the letter e. It plays an important role in mathematics
as we shall see later.

We see in Fig. 40 that the angle of intersection is greater than 45°
for a = 2 and smaller than 45° for a = 4; hence, the number e
lies between the limits 2 and 4. More accurate calculations which
will be described in Sec. IV.16 show that e = 2.71828 with an
accuracy of $10^{-5}$. The notation e for this number was introduced
by Euler.

Logarithms to the base e are called natural logarithms [Napierian
logarithms after the Scottish mathematician J. Napier (1550-1617)].
They are denoted as $\ln x = \log_e x$. The graph of the natural
logarithm is shown in Fig. 41. A logarithm to any other base can be
expressed in terms of the natural logarithms in accordance with
formula (13):

$$\log_a x = \frac{\ln x}{\ln a} \qquad (14)$$

Hence, the formulas for passing from the common (decimal) logarithms
to the natural ones and vice versa are

log x = 0.4343 ln x and ln x = 2.303 log x

where the values of the proportionality factors are accurate to four
significant digits.

Besides the natural logarithms we also use the common logarithms
(in numerical calculations) and the logarithms to the base 2 (in
information theory and some other branches of modern mathematics).
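
Formulas (13) and (14) are what one uses in practice when only the
natural logarithm is available. The short Python sketch below is an
illustration under our own assumptions (the function name log_base is
hypothetical; math.log is the natural logarithm of the standard
library). It also reproduces the two proportionality factors quoted
above.

```python
import math


def log_base(x, a):
    """Logarithm of x to the base a via formula (14): ln x / ln a.

    Requires x > 0, a > 0 and a != 1.
    """
    return math.log(x) / math.log(a)


if __name__ == "__main__":
    # Proportionality factors between common and natural logarithms.
    print(round(1 / math.log(10), 4))   # 0.4343
    print(round(math.log(10), 3))       # 2.303
    # Formula (13): log_b x = log_a x * log_b a, here with a = 2, b = 10.
    x = 7.5
    print(log_base(x, 10), log_base(x, 2) * log_base(2, 10))
```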



27. Exponential Function. An exponential function is a function
of the form

$$y = a^x \qquad (15)$$

The function is defined for all x, and we always consider the values
a > 0 (because raising a < 0 to a fractional power may result in
an imaginary number). Equality (15) can be obtained from formula (12)
if we solve it for x (which yields $x = a^y$) and then interchange
x and y. Consequently (see Sec. 21) the exponential function
and the logarithmic function are the inverse functions with respect
to each other.

Therefore the graphs of exponential functions which are depicted
in Fig. 42 for different bases a are obtained as the mirror images of
the corresponding graphs (shown in Fig. 40) with respect to the
bisector of the angle between the coordinate axes. If a > 1 the
exponential function is an increasing function, and the greater a,
the greater the rate of its increase. In case 0 < a < 1 the
exponential function is a decreasing function.

We often deal with the exponential function with a = e. In this
case there is special notation: $y = e^x = \exp x$.

Any exponential function with an arbitrary base a can be reduced
to the base e; indeed, the definition of a logarithm implies that
$a = e^{\ln a}$ and therefore $a^x = (e^{\ln a})^x = e^{kx}$ where
$k = \ln a$.
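
The reduction to the base e is also the usual way of computing
$a^x$. A minimal Python sketch follows; the function name exp_base is
our own illustrative assumption, while math.exp and math.log are the
standard library exponential and natural logarithm.

```python
import math


def exp_base(a, x):
    """Compute a**x as e**(k*x) with k = ln a; requires a > 0."""
    if a <= 0:
        raise ValueError("the base a must be positive")
    k = math.log(a)          # k = ln a
    return math.exp(k * x)   # a**x = e**(k*x)


if __name__ == "__main__":
    for a, x in ((2.0, 10.0), (0.5, 3.0), (10.0, -1.5)):
        # The two printed values should agree up to rounding.
        print(exp_base(a, x), a ** x)
```

For 0 < a < 1 the factor k = ln a is negative, which agrees with the
statement that the exponential function is then decreasing.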

28. Hyperbolic Functions. The hyperbolic sine, cosine and tangent
are, respectively, the functions

$$\sinh x = \frac{e^x - e^{-x}}{2}, \qquad \cosh x = \frac{e^x + e^{-x}}{2}, \qquad \tanh x = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}}$$

At first these terms sound strange but their genuine sense (for
example, the connection between sin x and sinh x or between the
hyperbolic functions and a hyperbola) will be explained only when we
get to Secs. VIII.4 and XIV.8. Let us now establish some formulas
connecting these functions. Squaring the first two equalities we get

$$\sinh^2 x = \frac{e^{2x} - 2 + e^{-2x}}{4} \quad \text{and} \quad \cosh^2 x = \frac{e^{2x} + 2 + e^{-2x}}{4}$$

Now subtracting and adding these two formulas we obtain

$$\cosh^2 x - \sinh^2 x = 1, \qquad \cosh^2 x + \sinh^2 x = \frac{e^{2x} + e^{-2x}}{2} = \cosh 2x$$

The obtained formulas indicate a significant analogy between
hyperbolic and trigonometric functions. We leave to the reader the
deduction of the formulas

$$\sinh 2x = 2 \sinh x \cosh x, \qquad \sinh(a + b) = \sinh a \cosh b + \cosh a \sinh b, \qquad 1 - \tanh^2 x = \frac{1}{\cosh^2 x}$$

and other similar formulas. Note that sinh 0 = 0 and cosh 0 = 1.
The functions sinh x and tanh x are odd whereas the function cosh x
is even; indeed, for example,

$$\sinh(-x) = \frac{e^{(-x)} - e^{-(-x)}}{2} = -\frac{e^x - e^{-x}}{2} = -\sinh x$$

The construction of the graphs of sinh x and cosh x is illustrated
in Fig. 43. The graph of tanh x is shown in Fig. 44. To construct
this graph we find its points with the help of the graphs of sinh x
and cosh x. It is clear that the hyperbolic functions do not possess
the most important property of trigonometric functions, namely,
the periodicity. Besides, the range (see Sec. 15) of each hyperbolic
function considerably differs from the range of the corresponding
trigonometric function.
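
The formulas of this section are easy to verify numerically. The
Python sketch below is only an illustration (the names sinh, cosh,
tanh and identity_residuals are defined here by ourselves directly
from the exponential function, and the test arguments are arbitrary).
For each formula it returns the difference between the two sides,
which should vanish up to rounding errors.

```python
import math


def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2


def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2


def tanh(x):
    return sinh(x) / cosh(x)


def identity_residuals(x, b):
    """Left-hand side minus right-hand side for each formula of the text."""
    return {
        "cosh^2 - sinh^2 = 1": cosh(x) ** 2 - sinh(x) ** 2 - 1,
        "cosh^2 + sinh^2 = cosh 2x": cosh(x) ** 2 + sinh(x) ** 2 - cosh(2 * x),
        "sinh 2x = 2 sinh cosh": sinh(2 * x) - 2 * sinh(x) * cosh(x),
        "sinh(x + b)": sinh(x + b) - (sinh(x) * cosh(b) + cosh(x) * sinh(b)),
        "1 - tanh^2 = 1/cosh^2": 1 - tanh(x) ** 2 - 1 / cosh(x) ** 2,
        "sinh is odd": sinh(-x) + sinh(x),
    }


if __name__ == "__main__":
    for name, residual in identity_residuals(0.7, -1.3).items():
        print(name, residual)
```

Such a check does not replace the deduction of the formulas left to
the reader, but it quickly reveals a misprint in a sign.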



Fig. 43

The graph of tanh x has two asymptotes since, for large values of
|x|, we have $e^{-|x|} \ll 1 \ll e^{|x|}$ (the sign $\ll$ means here
“much smaller”) and therefore tanh x ≈ 1 for large |x| and x > 0, and
tanh x ≈ -1 for large |x| and x < 0.

Fig. 44

We sometimes consider the inverse hyperbolic functions which
are denoted as $\sinh^{-1} x$, $\cosh^{-1} x$ and $\tanh^{-1} x$,
respectively. Figs. 43 and 44 show that the first and the third
functions are single-valued (compare with Fig. 25) whereas the second
one is two-valued. All these functions can be expressed in terms of
logarithms. In fact, let, for example, $y = \sinh^{-1} x$. Then, by
the definition of an inverse function, we have

$$x = \sinh y = \frac{e^y - e^{-y}}{2}$$

i.e.

$$e^y - e^{-y} - 2x = 0, \qquad e^{2y} - 2xe^y - 1 = 0$$

(the second equality, obtained from the first one on multiplying by
$e^y$, is a quadratic equation in $e^y$) which implies
$e^y = x \pm \sqrt{x^2 + 1}$. The left-hand side being positive,
the right-hand side should also be positive. Therefore we can take
only the plus sign in front of the radical. Now taking logarithms we
obtain

$$y = \sinh^{-1} x = \ln\left(x + \sqrt{x^2 + 1}\right) \qquad (16)$$
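
Formula (16) can be compared with a library implementation of the
inverse hyperbolic sine. The Python sketch below is an illustration
only (the function name asinh_log is our own assumption; math.asinh
and math.sinh belong to the standard library).

```python
import math


def asinh_log(x):
    """Inverse hyperbolic sine by formula (16): ln(x + sqrt(x**2 + 1))."""
    return math.log(x + math.sqrt(x * x + 1))


if __name__ == "__main__":
    for x in (-2.0, -0.3, 0.0, 1.5, 10.0):
        y = asinh_log(x)
        # Compare with the library function and check that sinh y gives x back.
        print(x, y, math.asinh(x), math.sinh(y))
```

We note, as a caveat of our own, that for negative x of very large
absolute value the sum under the logarithm is a difference of nearly
equal numbers, so that a direct use of (16) in floating-point
arithmetic loses accuracy there; the oddness of the function,
$\sinh^{-1}(-x) = -\sinh^{-1} x$, can then be used instead.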

29. Trigonometric Functions. The function y = sin x with
period $2\pi$ is well known from courses on trigonometry. Its graph
(the sinusoid, the sine curve) is represented in Fig. 45. The function
is odd, has no points of discontinuity and is bounded (its values
lie between the limits -1 and +1). We have
$\cos x = \sin\left(x + \frac{\pi}{2}\right)$ and therefore the graph
of the function cos x is the same sinusoid but translated
$\frac{\pi}{2}$ units of length to the left; this graph is also shown
in Fig. 45. In many applications we encounter a sinusoidal,
“harmonic” relation of the form

$$y = M \sin(\omega t + \alpha) \qquad (17)$$

where the independent variable t is interpreted as time, the
constant M > 0 is called an amplitude and $\omega > 0$ is called
a frequency (circular, angular frequency). The sum
$\omega t + \alpha$ is called a phase and the constant $\alpha$ is an
initial phase which is obtained from the phase by substituting t = 0.
We can easily investigate in what way the parameters M, $\omega$ and
$\alpha$ affect the form and the disposition of the sinusoid (compare
with Sec. 19). The amplitude M changes the range of the sinusoid so
that it extends from -M to M, the frequency $\omega$ changes the
period $2\pi$ into $T = \frac{2\pi}{\omega}$ and the presence of the
initial phase $\alpha$ displaces the sinusoid to the left by the
distance $\frac{\alpha}{\omega}$ [since
$\omega t + \alpha = \omega\left(t + \frac{\alpha}{\omega}\right)$
and therefore the value $\frac{\alpha}{\omega}$ is added to the
argument]. The graph thus obtained is represented in Fig. 46.

A function of form (17) is obtained, in particular, when we
transform the expression $A \cos \omega t + B \sin \omega t$. The
right-hand side of (17) can be rewritten in the form
$M \sin\alpha \cdot \cos\omega t + M \cos\alpha \cdot \sin\omega t$
and thus in order to obtain the equality

$$A \cos\omega t + B \sin\omega t \equiv M \sin(\omega t + \alpha) \qquad (18)$$

we must have $A = M \sin\alpha$ and $B = M \cos\alpha$. From this it
is easy to find M and $\alpha$: $M = \sqrt{A^2 + B^2}$ and
$\tan\alpha = \frac{A}{B}$; the quadrant in which $\alpha$ should be
taken is defined by the signs of $\sin\alpha$ and $\cos\alpha$, i.e.
by the signs of A and B.
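
The choice of the quadrant is exactly what the two-argument
arctangent of most programming languages takes care of. The Python
sketch below is an illustration under our own assumptions (the
function name amplitude_phase and the numerical values are
hypothetical): math.atan2(A, B) returns the angle α for which sin α
has the sign of A and cos α has the sign of B, and math.hypot(A, B)
returns $\sqrt{A^2 + B^2}$.

```python
import math


def amplitude_phase(A, B):
    """Return (M, alpha) with A*cos(w*t) + B*sin(w*t) == M*sin(w*t + alpha).

    M = sqrt(A**2 + B**2); alpha satisfies A = M*sin(alpha), B = M*cos(alpha).
    """
    M = math.hypot(A, B)
    alpha = math.atan2(A, B)   # note the order: sine component first
    return M, alpha


if __name__ == "__main__":
    A, B, w = 1.0, -1.0, 2.0           # illustrative values
    M, alpha = amplitude_phase(A, B)
    print(M, alpha)                    # alpha lies in the second quadrant
    for t in (0.0, 0.3, 1.7):
        left = A * math.cos(w * t) + B * math.sin(w * t)
        right = M * math.sin(w * t + alpha)
        print(left, right)             # the two values should agree
```

Using the one-argument arctangent of A/B alone would give a wrong
initial phase whenever B < 0, which is precisely the case the remark
about the quadrant warns against.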

In case the independent variable is interpreted not as time but
as a geometrical coordinate the sinusoidal relation is usually written
in the form $y = M \sin(kx + \alpha)$ instead of (17). In this case k
is called a wave-number and $\lambda = \frac{2\pi}{k}$ is
a wave-length.

The function y = tan x has the period $\pi$ since
$\tan(x + \pi) \equiv \tan x$. It has the points of discontinuity at
$x = \pm\frac{\pi}{2}, \pm\frac{3\pi}{2}, \dots$ (this can be written
in the general form as $x = \frac{\pi}{2} + k\pi$ where
k = 0, ±1, ±2, ...). Indeed, at these points cos x = 0 and therefore
tan x = ±∞. The graph of the function (the tangent curve) is
represented in Fig. 47; it consists of an infinitude of similar
components and has infinitely many asymptotes. The graph of the
function y = cot x is also shown in Fig. 47. We have

$$\cot x = -\tan\left(x - \frac{\pi}{2}\right)$$

and therefore the curve has the same form but its disposition is
changed in the corresponding way.

Fig. 46

The function y = Arc sin x is the inverse function of y = sin x
and therefore its graph (see Fig. 48) is the mirror image of the graph
of y = sin x about the bisector of the first quadrant angle. This
function is multiple-valued (more precisely, infinite-valued) and
therefore (see Secs. 20-21) one usually considers its principal branch
(the principal value of the arc sine) which is shown in Fig. 48 in
heavy line; this branch is denoted as

$$y = \arcsin x, \qquad -\frac{\pi}{2} \le \arcsin x \le \frac{\pi}{2}$$

and is a single-valued function. Other branches of the function have
no special names.

Fig. 47    Fig. 48

The functions y = Arc cos x and y = Arc tan x can be investiga- 
ted in a similar way and we leave this to the reader. 

We should note in conclusion that we shall always deal with
dimensionless (abstract) values of arc sin x. For example, we have

$$2^{\arcsin 1} = 2^{\pi/2} = 2^{1.57} = 2.97$$ *

Similarly, the values of the function y = sin x are taken for
dimensionless values of x. We mean here that the sine of a number x
is the sine of the angle of x radians. For instance,
sin 1 = sin 57°18' = 0.8415.

* The last two equalities are approximate. If one intends to stress
this fact one writes $2^{\pi/2} \approx 2^{1.57} \approx 2.97$. We
are not going to mention stipulations of this kind in the future.

30. Empirical Formulas. We have already mentioned (see Sec. 13)
that an experiment often results in a function y = f(x) which we
are interested in and represents the function in a tabular form (2).
In such a case the problem of selecting an appropriate empirical
formula for the function may arise. We usually begin with
representing the values of the function on graph paper or some other
appropriate paper. Then we select a certain form of the formula we
are going to use. If the form is not implied by general considerations
we usually choose one of the functions described in Secs. 22-29 or
a simple combination of such functions (a sum of power functions
or of exponential functions and the like). In order to select such
a formula in the best way one must know the graphs of these functions
well. When selecting a function we must try to achieve the
resemblance between the characteristic peculiarities of a sought-for
function φ(x) and of the function f(x) under consideration. For
example, if the physical meaning of the function indicates that f(x)
is even and f(0) = 0 then the function φ(x) should also have these
properties and so on. It sometimes turns out that we cannot find
a single formula for the whole interval of x. Then it is necessary to
divide the interval into several parts and select, for each of the
parts, its own appropriate formula.

Fig. 49

After the form of the formula has been chosen it is necessary to 
determine the values of parameters entering into the formula. 

For example, suppose that after plotting the points we obtain 
the drawing shown in Fig. 49. If we have certain reasons to suspect 
that the experiment or the calculations of the values of the function 
could contain essential errors we must simply discard the points 
which fall out of the general form of the relationship described by 
the data represented in our drawing. For instance, the point P
in Fig. 49 is a point of this kind. By the way, such points may
sometimes indicate that certain important factors were not taken into
account and then, of course, we must pay close attention to them.

The remaining points in Fig. 49 resemble a linear relation of the
form y = ax + b. In order to determine the parameters a and b
let us draw a straight line such that the experimental points
lie as close as possible to the line. This can easily be done by means
of a transparent ruler. We apply the ruler to the drawing and then
approximately find the sought-for position of the ruler. For example,
the straight line drawn in Fig. 49 yields b = 0.50 and a = 0.58,
i.e. y = 0.58x + 0.50.
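The position of the ruler can also be imitated numerically. The sketch below does not reproduce the graphical construction of the text; it swaps in the method of least squares, which is one common way of making "as close as possible" precise. The name `fit_line` and the sample points are hypothetical, the points being chosen so that they scatter about y = 0.58x + 0.50.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least-squares straight line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared and mixed deviations from the mean values
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical measurements (not the data of Fig. 49)
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0.52, 0.77, 1.10, 1.35, 1.67]

a, b = fit_line(xs, ys)
print(f"y = {a:.2f}x + {b:.2f}")
```

A point that was discarded as an outlier, like the point P in Fig. 49, should simply be left out of `xs` and `ys` before the fit.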

The selection of a linear relation described above is comparatively 
simple. Therefore when choosing some other kind of functional 
relation one often tries to introduce new variables so that there 
should be a linear relation between the new variables and then to 
determine the parameters entering into the linear relation. We can 
apply this method only if there are no more than two such parameters 
since a linear function contains two parameters. 

For example, let an experiment yield the following table of values:

x   0.00  0.10  0.20  0.30  0.40  0.50  0.60  0.70  0.80  0.90  1.00
y   0.00  0.01  0.03  0.08  0.17              0.66  0.91  1.22  1.57

We leave it to the reader to plot the experimental points on
graph paper. The disposition of the points thus constructed
resembles the graph of a power function of the form $y = ax^{\alpha}$.

Fig. 50

In order to determine the parameters $a$ and $\alpha$ we take
logarithms of both sides of the equality and denote $\log y = Y$,
$\log x = X$ and $\log a = A$. Then we arrive at the equality
$Y = \alpha X + A$ and thus we see that there is a linear relation
between the new variables. By means of a table of logarithms we
compile the table of the values of the new variables:


X   -1.0   -0.70  -0.52  -0.40  -0.30  -0.22  -0.15  -0.10   -0.05   0.00
Y   -2.0   -1.5   -1.1   -0.77                -0.18  -0.041   0.086  0.196


The points thus obtained lie close enough to the straight
line drawn in Fig. 50. In drawing the straight line we should
pay more attention to the last three points whose positions are
determined with greater accuracy. Our construction yields the values
$A = 0.196$ and $\alpha = 2.44$, that is $a = 1.57$. Hence we finally
obtain $y = 1.57x^{2.44}$.

For some further rules and examples see [51].
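The passage to logarithms can be sketched in a few lines. The code below is an illustration under stated assumptions, not the graphical construction of the text: it uses ordinary least squares (a swapped-in technique) on the pairs of the table that are legible, and the helper `fit_line` is a hypothetical name. Since least squares weights all points equally, whereas the text deliberately favours the last three points, the resulting parameters need not coincide with 1.57 and 2.44.

```python
import math


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Legible pairs of the table; x = 0 is omitted because log 0 is undefined
xs = [0.10, 0.20, 0.30, 0.40, 0.70, 0.80, 0.90, 1.00]
ys = [0.01, 0.03, 0.08, 0.17, 0.66, 0.91, 1.22, 1.57]

# New variables X = log x, Y = log y, linked by Y = alpha*X + A
X = [math.log10(x) for x in xs]
Y = [math.log10(y) for y in ys]

alpha, A = fit_line(X, Y)
a = 10 ** A          # return from A = log a to the parameter a
print(f"y = {a:.2f} * x^{alpha:.2f}")
```

The first few values of y are given with one significant figure only, which is the reason the text advises relying on the last points.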



















CHAPTER II 


Plane Analytic Geometry 


Analytic geometry is a branch of mathematics in which geometri- 
cal problems are investigated on the basis of the coordinate method 
by means of algebraic techniques. 


§ 1. Plane Coordinates 


1. Cartesian Coordinates. Cartesian coordinates are known from
elementary mathematical courses and we have already used them
(see Chapter I). Cartesian coordinates are named after R. Descartes.
R. Descartes and P. Fermat (1601-1665) are the founders of the
coordinate method.

Fig. 51

Several points are depicted in Cartesian coordinates in Fig. 51.
It should be noted that we take mutually perpendicular coordinate
axes here and that the unit of length is the same for both axes. The
origin of the coordinate system is placed at the point of
intersection of the axes from which the distances along the axes are
reckoned. (As was mentioned in Sec. 1.14, when we construct graphs,
we can sometimes take different scales for the axes and change the
position of the point the coordinates are reckoned from.) Each point
in the coordinate plane has certain uniquely determined coordinates
and, conversely, to each ordered pair of coordinates x and y there
corresponds a certain uniquely determined point of the plane. This
basic property makes it possible to consider the coordinates of
points instead of the points themselves.

The coordinate axes break the plane into quarters (quadrants)
which are numbered in the way shown in Fig. 52. Each of these
quadrants is characterized by its specific combination of the signs
of abscissas and ordinates; this is also shown in Fig. 52.

2. Some Simple Problems Concerning Cartesian Coordinates. 

(1) The distance between two given points. Let the points
$M_1(x_1, y_1)$ and $M_2(x_2, y_2)$ be given (i.e. their coordinates
are known). It is required to determine the distance $d = M_1M_2$
(see Fig. 53). The formula for the distance is implied by Pythagoras'
theorem applied to the right triangle $M_1M_2P$. Thus we have
$M_1M_2^2 = M_1P^2 + PM_2^2$, i.e.
$d^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2$, or

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \qquad (1)$$

This formula and all the following formulas hold for any two points
$M_1$ and $M_2$ placed in an arbitrary manner in the coordinate plane.
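Formula (1) translates directly into code. The following is a minimal sketch; the function name `distance` and the representation of a point as a pair of numbers are assumptions made for the illustration.

```python
import math


def distance(p1, p2):
    """Distance between the points p1 = (x1, y1) and p2 = (x2, y2), formula (1)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


# The legs of the right triangle are 3 and 4, so the distance is 5.0
print(distance((1, 2), (4, 6)))
```

The order of the two points does not matter, since the differences enter the formula squared.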

(2) Division of a line segment in a given ratio. Let some points
$M_1(x_1, y_1)$ and $M_2(x_2, y_2)$ be given. It is required to find
a point $M(x, y)$ lying on the segment $M_1M_2$ such that the ratio of
division $M_1M : MM_2$ should be equal to $\lambda$ where $\lambda$ is
a given number (see Fig. 54). The solution of the problem follows from
the similarity of the triangles $M_1PM$ and $MQM_2$ which implies
$\frac{x - x_1}{x_2 - x} = \lambda$, i.e.
$x - x_1 = \lambda x_2 - \lambda x$. From the last relation we deduce

$$x = \frac{x_1 + \lambda x_2}{1 + \lambda}, \qquad y = \frac{y_1 + \lambda y_2}{1 + \lambda} \qquad (2)$$

(the expression for y is obtained in like manner). In particular, in
the case $\lambda = 1$, that is when the segment is halved, we have

$$x = \frac{x_1 + x_2}{2}, \qquad y = \frac{y_1 + y_2}{2}$$

(3) Transformation of coordinates without changing the scale. Suppose
we have an "old" coordinate system x, y. Now let a "new" coordinate
system x', y' be introduced. It is required to establish the
relationship between the old coordinates and the new ones. We shall
consider the following three cases.

Fig. 54

Fig. 55

I. Let the new coordinates be the result of a translation of the
original ones. Suppose the new origin of the coordinate system has
the coordinates (a, b) with respect to the original coordinate system.
Then, as is implied by Fig. 55, we have

$$x = x' + a, \qquad y = y' + b$$

Fig. 56

II. Suppose the new coordinate axes are the mirror image of the
old ones (for example, the mirror image about the y-axis). Then,
as is seen in Fig. 56, the relationship is

$$x = -x', \qquad y = y' \qquad (3)$$

III. Let now the new axes be the result of turning the old axes
about the origin of the coordinate system through an angle $\alpha$
(see Fig. 57). Then the equalities $OC = OD - AB$ and
$CM = DB + AM$ imply

$$x = x'\cos\alpha - y'\sin\alpha, \qquad y = x'\sin\alpha + y'\cos\alpha \qquad (4)$$



The general case of passing from one Cartesian system to another 
is a mere combination of the above basic cases. 
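The division formula (2) and the three basic transformations of this section can be sketched as follows. All function names are hypothetical; each transformation takes the new coordinates (x', y') of a point and returns its old coordinates (x, y), exactly as the formulas of cases I-III are written.

```python
import math


def divide_segment(p1, p2, lam):
    """Point M dividing the segment p1-p2 so that M1M : MM2 = lam, formula (2)."""
    (x1, y1), (x2, y2) = p1, p2
    return ((x1 + lam * x2) / (1 + lam), (y1 + lam * y2) / (1 + lam))


def from_translated(p_new, a, b):
    """Case I: the new origin has the old coordinates (a, b)."""
    x_new, y_new = p_new
    return (x_new + a, y_new + b)


def from_mirrored(p_new):
    """Case II: the new axes are the mirror image of the old ones about the y-axis."""
    x_new, y_new = p_new
    return (-x_new, y_new)


def from_rotated(p_new, alpha):
    """Case III: the new axes are the old ones turned through alpha radians."""
    x_new, y_new = p_new
    return (x_new * math.cos(alpha) - y_new * math.sin(alpha),
            x_new * math.sin(alpha) + y_new * math.cos(alpha))


print(divide_segment((0, 0), (6, 3), 2))   # two thirds of the way: (4.0, 2.0)
print(from_translated((1, 1), 2, 3))       # (3, 4)
print(from_rotated((1, 0), math.pi / 2))   # close to (0, 1) up to rounding

# The general case is a combination of the basic ones: rotation, then translation
print(from_translated(from_rotated((1, 0), math.pi / 2), 2, 3))
```

Composing the functions in the last line illustrates the remark that a general passage from one Cartesian system to another is a combination of the basic cases.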

3. Polar Coordinates. Besides Cartesian coordinate systems we
can construct many other coordinate systems, that is, there
are many ways to determine the position of a point in a plane by
means of two numerical parameters (coordinates). An appropriate
system should be chosen for each occasion, Cartesian systems being
applied more often than others. In this section we shall consider
only one of the systems, namely, the so-called polar coordinate
system which is particularly convenient for investigating rotary
motion. In order to construct polar coordinates we arbitrarily choose
a pole O and a polar axis Op (see Fig. 58). Then the position of
a point M can be characterized by its polar radius (radius-vector)
$\rho$ (i.e. the distance from O to M) and its polar angle (vectorial
angle) $\varphi$ (which is also called the phase or the amplitude of
the point M). $\rho$ and $\varphi$ are called the polar coordinates
of the point M. The vectorial angle $\varphi$ is considered positive
if it is reckoned in the positive direction (which is usually taken
counterclockwise) and negative otherwise. Several points with given
polar coordinates are constructed in Fig. 59. In particular, we see
that the pole has the zero radius-vector while its vectorial angle is
undetermined and can be chosen in an arbitrary way. In order to
describe all the positions a point can occupy in a plane it is
sufficient to take the angle $\varphi$ within the limits
$-180^\circ < \varphi \le 180^\circ$ but it may sometimes be
convenient to consider the values of $\varphi$ which fall outside
the interval. The addition of 360° to the vectorial angle of a point
does not change its position.

The relationship between the Cartesian coordinates and the polar
coordinates of a point in case the systems are placed as shown
in Fig. 60 has the form

$$x = \rho\cos\varphi, \qquad y = \rho\sin\varphi$$

and, conversely,

$$\rho = \sqrt{x^2 + y^2}, \qquad \tan\varphi = \frac{y}{x} \qquad (5)$$
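Formulas (5) determine only the tangent of the polar angle, so the quadrant has to be chosen from the signs of x and y. A minimal sketch, assuming the arrangement of the two systems described above: the standard function `math.atan2` makes this choice and returns an angle between -180° and 180°, while the names `to_cartesian` and `to_polar` are hypothetical.

```python
import math


def to_cartesian(rho, phi):
    """Cartesian coordinates of the point with polar coordinates (rho, phi)."""
    return (rho * math.cos(phi), rho * math.sin(phi))


def to_polar(x, y):
    """Polar radius and polar angle (in radians) of the point (x, y)."""
    rho = math.hypot(x, y)      # sqrt(x**2 + y**2)
    phi = math.atan2(y, x)      # picks the quadrant from the signs of x and y
    return rho, phi


rho, phi = to_polar(-1.0, 1.0)
print(rho, math.degrees(phi))   # a point of the second quadrant

# Adding a full turn to the angle does not change the position of the point
print(to_cartesian(rho, phi), to_cartesian(rho, phi + 2 * math.pi))
```

For the pole itself the angle is undetermined; `math.atan2(0, 0)` returns 0, which is one of the admissible arbitrary choices.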

§ 2. Curves in Plane 

4. Equation of a Curve in Cartesian Coordinates. As it was shown 
in Sec. 1.20, an equation of the form 

$$F(x, y) = 0 \qquad (6)$$

defines a curve (L) in the x, y-plane (i.e. in the plane where the
coordinate system is taken). The curve is the locus of all the points
whose coordinates satisfy equation (6). The relation of form (6)
is called the equation of the curve (L).

Conversely, if a curve (L) is given in the x, y-plane then its
geometrical properties can be expressed in terms of some analytic
relations between the coordinates of its points, and thus an equation
of the curve (L) is obtained in form (6). (It should be taken into
account, of course, that every equation can be rewritten in different
equivalent forms.) Consequently, the coordinate method enables us to
consider the equations of curves instead of the curves themselves.
Thus, geometrical problems can be reduced to algebraic ones, and the
latter, as a rule, can be solved in a simpler and more uniform way
than the former. For example, in order to verify whether a curve with
an equation of form (6) (or, as we simply say, "the curve
$F(x, y) = 0$") passes through a point (a, b) it is sufficient to
substitute the coordinates of the point into the equation of the
curve and verify whether the equation is satisfied, i.e. whether we
have $F(a, b) = 0$.

As an example, let us deduce the equation of a circle (see Fig. 61).
Let the centre A of the circle have the coordinates (a, b) and let
M(x, y) be an arbitrary (moving, current) point of the circle. Then
the basic property characterizing a circle can be written as AM = R
where R is the radius of the circle. Now, applying formula (1) for
the distance between two points we obtain

$$\sqrt{(x - a)^2 + (y - b)^2} = R$$

or, squaring, we derive

$$(x - a)^2 + (y - b)^2 = R^2$$

This relation is satisfied by the coordinates of all the points of
the given circle and, at the same time, only the points belonging to
the circle satisfy it. Thus, this relation is an equation of the
circle; a, b and R entering into the equation are some fixed numbers
(i.e. parameters which characterize the position and the size of the
circle) whereas x and y are the current coordinates of a variable
point of the circle.

Now, contrary to the previous example, let an equation be originally
given, for example, the equation

$$x^2 + y^2 - 3x + 4y - 1 = 0 \qquad (7)$$

Transforming the equation and completing the squares we obtain

$$\left(x - \tfrac{3}{2}\right)^2 - \left(\tfrac{3}{2}\right)^2 + (y + 2)^2 - 2^2 - 1 = 0,$$
$$\left(x - \tfrac{3}{2}\right)^2 + (y + 2)^2 - \tfrac{29}{4} = 0$$

Consequently the given equation is the equation of a circle with
centre at the point (1.5, -2) and of radius $\sqrt{29/4} = 2.69$.
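Completing the squares can be carried out once and for all for an equation of the form x² + y² + Dx + Ey + F = 0. The sketch below assumes this notation for the coefficients (the letters D, E, F and the name `circle_from_general` do not occur in the text) and returns the centre and the radius.

```python
import math


def circle_from_general(D, E, F):
    """Centre and radius of x^2 + y^2 + D*x + E*y + F = 0.

    Returns None when the equation does not define a proper circle,
    i.e. when the squared radius is zero or negative.
    """
    a, b = -D / 2, -E / 2                 # centre found by completing the squares
    r_squared = a * a + b * b - F
    if r_squared <= 0:
        return None
    return (a, b), math.sqrt(r_squared)


centre, radius = circle_from_general(-3, 4, -1)   # equation (7)
print(centre, round(radius, 2))                   # (1.5, -2.0) 2.69
```

The case of a zero or negative squared radius corresponds to the singular cases discussed in Sec. 8.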

If two curves with the equations $F_1(x, y) = 0$ and $F_2(x, y) = 0$
are given we can pose the problem of finding the points of
intersection of the curves. A sought-for point of intersection must
belong to both curves simultaneously and therefore its coordinates x
and y must satisfy the equations of both curves. Hence, in order to
determine the coordinates we must solve the following system of
equations:

$$\begin{cases} F_1(x, y) = 0 \\ F_2(x, y) = 0 \end{cases} \qquad (8)$$

Such a system of equations can have a number of distinct solutions
and this number corresponds to the number of possible points of
intersection. Of course, a solution is understood as a pair of
values x and y satisfying (8).

For example, let it be necessary to determine the points of
intersection of circle (7) with the straight line $y = x + b$ where b
is a constant. To do this we should solve the system of equations

$$\begin{cases} x^2 + y^2 - 3x + 4y - 1 = 0 \\ y = x + b \end{cases}$$

Expressing y in terms of x from the second equation, substituting
the expression thus obtained for y into the first equation, removing
brackets and solving the quadratic equation in x we obtain after
some transformations

$$x_1 = \frac{-1 - 2b + \sqrt{9 - 28b - 4b^2}}{4}, \qquad y_1 = \frac{-1 + 2b + \sqrt{9 - 28b - 4b^2}}{4},$$
$$x_2 = \frac{-1 - 2b - \sqrt{9 - 28b - 4b^2}}{4}, \qquad y_2 = \frac{-1 + 2b - \sqrt{9 - 28b - 4b^2}}{4}$$

Let us find for what values of b both points of intersection coincide.
This will happen in case the expression under the radical sign
vanishes, which implies $b_{1,2} = \frac{-7 \pm \sqrt{58}}{2}$, i.e.
$b_1 = 0.31$ and $b_2 = -7.31$. The straight line $y = x + b$, as is
shown in Fig. 62, is tangent to the circle for these values of b. If
$b_2 < b < b_1$ then there are two distinct points of intersection:
$(x_1, y_1)$ and $(x_2, y_2)$. The straight line does not intersect
the circle at all when b lies outside the above interval (in this
case the expression under the radical sign is negative).
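The three possibilities (two points, one point of tangency, no points) can be checked numerically. The sketch below hard-codes circle (7) and the family of lines y = x + b; the function names are hypothetical, and the exact comparison with zero for the tangency case is only a simplification suited to this illustration.

```python
import math


def line_circle_points(b):
    """Points of intersection of circle (7) with the line y = x + b."""
    disc = 9 - 28 * b - 4 * b * b          # expression under the radical sign
    if disc < 0:
        return []                          # the line misses the circle
    root = math.sqrt(disc)
    xs = [(-1 - 2 * b + root) / 4, (-1 - 2 * b - root) / 4]
    if disc == 0:
        xs = xs[:1]                        # tangency: the two points coincide
    return [(x, x + b) for x in xs]


def circle_lhs(x, y):
    """Left-hand side of equation (7); it vanishes on the circle."""
    return x * x + y * y - 3 * x + 4 * y - 1


for x, y in line_circle_points(0.0):
    print(round(x, 3), round(y, 3), abs(circle_lhs(x, y)) < 1e-12)

print(line_circle_points(1.0))             # b lies outside the interval: no points

# Values of b for which the line is tangent to the circle
b1 = (-7 + math.sqrt(58)) / 2
b2 = (-7 - math.sqrt(58)) / 2
print(round(b1, 2), round(b2, 2))
```

Because of rounding the discriminant is rarely exactly zero in floating-point arithmetic, so in practice tangency would be tested with a small tolerance.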

It may be noted that in many different problems the coincidence 
of the points of intersection whose coordinates are found from a 
system of type (8) usually indicates that the two given curves are 
tangent to each other at this common point, that is they have a com- 
mon tangent line at the point. 

5. Equation of a Curve in Polar Coordinates. An equation
connecting the plane coordinates of points taken with respect to any
given coordinate system defines some curve in the coordinate plane
(with some exceptions to this rule which will be discussed in Sec. 8).
In particular, let us consider a polar coordinate system. We suppose
that the equation is solved for $\rho$, i.e. it has the form

$$\rho = f(\varphi) \qquad (9)$$

Making $\varphi$ assume all the possible numerical values and finding
the corresponding values of $\rho$ we obtain the locus of these points
which form a curve in the plane, that is the graph of function (9) in
the polar coordinates.

Fig. 63

Let us take two examples. The graph of a linear relation of the
form $\rho = a\varphi + b$ is shown in Fig. 63. This curve is called
the spiral of Archimedes. It can be obtained by the superposition of
a uniform rotary motion of the radius-vector and a uniform rectilinear
motion along the radius. Indeed, if

$$\rho = vt + b \quad \text{and} \quad \varphi = \omega t \quad \text{then} \quad \rho = \frac{v}{\omega}\varphi + b$$

Thus we see that the graph of one and the same function (a linear
function in this particular case) regarded with respect to a Cartesian
coordinate system and to a polar coordinate system has quite
different forms (see Sec. 1.22).

The graph of the exponential function $\rho = e^{k\varphi}$ in polar
coordinates is depicted in Fig. 64. This curve is the so-called
logarithmic spiral. The curve winds infinitely round the pole (and
tends to the pole as $\varphi \to -\infty$ for $k > 0$) but never
reaches it.

The logarithmic spiral has some interesting properties. For example,
if we perform a similarity transformation, that is if we expand
the spiral uniformly in all directions, with the similarity coefficient
m (i.e. all the linear sizes are increased m times) we shall obtain a
new curve with the equation $\rho = me^{k\varphi}$. But
$me^{k\varphi} = e^{\ln m}e^{k\varphi} = e^{k\left(\varphi + \frac{\ln m}{k}\right)} = e^{k(\varphi + \alpha)}$
where $\alpha = \frac{\ln m}{k}$ and therefore the result would be as
if we turned the original spiral about the pole through the angle
of $\alpha$ radians in the clockwise direction. Indeed, generally the
graph of a function $\rho = f(\varphi + \alpha)$ is obtained from the
graph of the function $\rho = f(\varphi)$ by rotating the latter about
the pole through the angle $\alpha$ in the negative (i.e. clockwise)
direction. (Why is it so?) Consequently, the logarithmic spiral is
similar to itself with any arbitrary similarity coefficient. Only
straight lines possess this property among all the other curves in
a plane.
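The self-similarity of the logarithmic spiral is easy to verify numerically: stretching the spiral m times must give the same polar radius as shifting the angle by α = ln m / k. In the sketch below the values of k and m are hypothetical and the function name `log_spiral` is an assumption.

```python
import math


def log_spiral(phi, k):
    """Polar radius of the logarithmic spiral rho = exp(k * phi)."""
    return math.exp(k * phi)


k, m = 0.2, 3.0                  # hypothetical spiral parameter and similarity coefficient
alpha = math.log(m) / k          # angle of the equivalent rotation, in radians

for phi in (-2.0, 0.0, 1.5, 10.0):
    stretched = m * log_spiral(phi, k)          # similarity transformation
    shifted = log_spiral(phi + alpha, k)        # same spiral with the angle shifted
    print(phi, math.isclose(stretched, shifted))
```

Replacing `log_spiral` by the spiral of Archimedes makes the comparison fail, which illustrates that the property is a special one.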

In conclusion, let us discuss the notion of coordinate curves.
A curve is called a coordinate curve if one of the coordinates remains
constant along the curve. The coordinate curves of a Cartesian
coordinate system are two families of straight lines parallel to the
coordinate axes. The curves $\rho = \text{const}$, which are
concentric circles with centre at the pole, form a family of
coordinate curves in a polar coordinate system. The second family of
coordinate curves in a polar system are the curves
$\varphi = \text{const}$, that is the family of half-lines issuing
from the pole (see Fig. 65).

6. Parametric Representation of Curves and Functions. There are
some cases when both coordinates of a point (for instance, the
Cartesian coordinates) belonging to the coordinate plane are
represented as certain functions of a third variable which we denote
by t:

$$x = \varphi(t), \qquad y = \psi(t) \qquad (10)$$

This third variable serves as a parameter determining the position 
of a point having the coordinates (x, y) in the coordinate plane. 
When t varies the corresponding point moves in the plane describing 
some curve (L) (see Fig. 66). In this case we say that the curve (L) 
is represented in parametric form (10). We have such a situation when,
for example, we investigate a motion of a point in the plane. In
this case t is time, formulas (10) determine the law of motion and
the curve (L) is called the trajectory of motion. In other problems
of this kind a parameter entering into equations may have some
other physical or geometrical meaning but even then it is usually
convenient to regard it as if it were time. It should be noted that
one and the same curve (L) can be represented by many different
forms of equations (10) since there can be different laws of motion
along one and the same trajectory (for instance, students walking
from a bus stop to their college can come 22 minutes or 2 minutes
before the lecture begins).

Fig. 65

In order to pass from equations of a curve given in form (10) 
to an equation of general form (6) we must eliminate the parameter 
from both equations (10). For example, we can express t in terms 
of x from the first equation and then substitute the result into the 
second equation. Of course, we can use any other procedure which 
eliminates t. But this is not always possible and besides we can 
sometimes find it inexpedient to do so. Therefore we often retain 
the parametric form of representation.

Equations (10) define a certain functional relation y(x). Indeed,
if we, for example, make x take on some value then the first equation
defines some value of t (of course, there may be more than one such
value) and the second equation defines some value (or several values)
of y. We see that the function y(x) turns out to be represented in
a parametric form and the curve (L) is its graph.

Example 1. Let us consider the motion of a shell fired with the
initial velocity $v_0$ at an angle $\alpha$ to the horizon (see
Fig. 67). We shall disregard air resistance, sphericity of the earth
and the earth's rotation. Then the horizontal component of the
velocity must remain constant and equal to $v_{0x} = v_0\cos\alpha$
whereas the vertical component of the velocity changes all the time.
The vertical motion has the constant acceleration of gravity g and
therefore the distance passed along the vertical differs from the
vertical displacement which corresponds to the constant speed
$v_{0y} = v_0\sin\alpha$ by $\frac{gt^2}{2}$ (this fact is well known
from mechanics). Thus, we obtain the law of motion in the x, y-plane:
$x = (v_0\cos\alpha)t$ and $y = (v_0\sin\alpha)t - \frac{gt^2}{2}$.
This is the law of motion that was sought for and at the same time it
defines the trajectory of motion in the parametric form. Eliminating
t we deduce

$$y = (\tan\alpha)x - \frac{g}{2v_0^2\cos^2\alpha}x^2 \qquad (11)$$

The dependence y(x) being a quadratic one, the trajectory is a
parabola (see Sec. 1.23).
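The elimination of the parameter can be checked by computing a point of the trajectory in two ways: from the parametric law of motion and from equation (11). The numerical values of g, of the initial speed and of the angle in the sketch below are assumptions chosen for the illustration, as are the function names.

```python
import math

G = 9.81   # acceleration of gravity in m/s^2 (assumed numerical value)


def position(t, v0, alpha):
    """Parametric law of motion: coordinates of the shell at the time t."""
    x = v0 * math.cos(alpha) * t
    y = v0 * math.sin(alpha) * t - G * t * t / 2
    return x, y


def trajectory(x, v0, alpha):
    """Height of the shell as a function of x, equation (11)."""
    return math.tan(alpha) * x - G * x * x / (2 * v0 ** 2 * math.cos(alpha) ** 2)


v0, alpha = 100.0, math.radians(30)   # hypothetical initial speed (m/s) and angle
for t in (0.0, 1.0, 5.0, 10.0):
    x, y = position(t, v0, alpha)
    print(t, math.isclose(y, trajectory(x, v0, alpha), abs_tol=1e-9))
```

Both descriptions give the same curve, but only the parametric one tells where the shell is at a given moment.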

Example 2. Let us now consider the trajectory of a point of a circle
rolling upon a straight line without sliding (the dotted line in
Fig. 68 shows the position of the circle at the initial moment t = 0
whereas the continuous line indicates some moving position). We
shall regard the angle of rotation of the circle (denoted by $\psi$)
as a parameter. Then

$$x = OQ = OT - QT = \smile MT - MP = R\psi - R\sin\psi = R(\psi - \sin\psi),$$
$$y = QM = TN - PN = R - R\cos\psi = R(1 - \cos\psi) \qquad (12)$$

where the line segment OT is equal to the length of the arc
$\smile MT$ according to the condition of the absence of sliding.
Hence, we have obtained the parametric equations of a curve which is
called the cycloid. The curve is infinite and has cusps which are
seen in Fig. 68.

Fig. 68
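The parametric equations of the cycloid make it easy to compute its points. A minimal sketch, with a hypothetical function name and radius, locates the cusps (where the moving point touches the straight line) and the highest point of an arch.

```python
import math


def cycloid(psi, R):
    """Point of the cycloid for the angle of rotation psi (in radians)."""
    return (R * (psi - math.sin(psi)), R * (1 - math.cos(psi)))


R = 1.0   # hypothetical radius of the rolling circle

# Cusps: after every full turn the point is on the straight line again
for n in range(3):
    x, y = cycloid(2 * math.pi * n, R)
    print(round(x, 4), round(y, 4))

# Highest point of the first arch: half a turn, the height is the diameter 2R
print(cycloid(math.pi, R))
```

The distance between neighbouring cusps equals the circumference 2πR, in agreement with the condition of rolling without sliding.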

The cycloid is one of the simplest curves belonging to the class
of roulettes; a roulette is the path in a fixed plane of any point in
a moving coincident plane when a given curve in the latter plane
rolls without sliding on a given curve in the former. The so-called
curtate and prolate cycloids (see Fig. 69) which are described by
a point rigidly connected with the plane of a circle and lying,
respectively, inside or outside the circle when the latter is rolling
upon a straight line give further examples of roulettes. By the way,
the prolate cycloid, as is seen in Fig. 69, has nodal points. Other
examples are the hypocycloid and the epicycloid which are described
by a point of a circle rolling upon another circle inside or outside
it, respectively (see Fig. 70). A hypocycloid with the ratio of the
radii of the circles equal to 1 : 4 is called the astroid. In case
this ratio is 1 : 2 the hypocycloid turns into a straight line
segment. The cardioid is an epicycloid with the ratio of the radii
1 : 1. These curves are depicted, respectively, in Figs. 71 and 72,
where the equations of the curves are also put down. The deduction of
these equations is left to the reader. All these curves are important
for applications in the theory of mechanisms.

Fig. 70

Fig. 71. Astroid. Show that the equality $\smile TM = \smile TA$
implies the equations $x = a\cos^3 t$, $y = a\sin^3 t$

7. Algebraic Curves. If the equation of a curve ( L ) has the form 

$$P(x, y) = 0 \qquad (13)$$

in Cartesian coordinates x and y where P is a polynomial of degree n
the curve (L) is said to be an algebraic curve of the nth order.
Non-algebraic curves are called transcendental. For example (see
Sec. 1.22), the graph of a linear function, that is a straight line,
is an algebraic curve of the first order, a quadratic parabola and
a circle are algebraic curves of the second order, the cubic parabola
and the semicubical parabola [the latter curve has the equation
$y = x^{2/3}$ which should be rewritten in the form $y^3 - x^2 = 0$ in
order to obtain an equation of form (13)] are algebraic curves of the
third order. On the other hand, the sinusoid, the tangent curve and
the graph of an exponential function are transcendental curves.

Fig. 72. Cardioid. Show that the equality $\smile TM = \smile TO$
implies the equation $\rho = 2a(1 - \cos\varphi)$

As an example of a curve of the fourth order let us consider
Cassinian ovals introduced by the Italian astronomer G. D. Cassini
(1625-1712). Let us take two points $F_1$ and $F_2$ in a plane and
form the product of their distances from a point M in the plane. The
locus of all points M in the plane for which the product is a constant
value is a Cassinian oval. Let us introduce Cartesian coordinate
axes in the way shown in Fig. 73. Denoting the coordinates of the
points $F_1$, $F_2$ and M as $F_1(-a, 0)$, $F_2(a, 0)$ and $M(x, y)$,
respectively, we obtain, by formula (1), the equation

$$\sqrt{(x + a)^2 + y^2}\,\sqrt{(x - a)^2 + y^2} = b^2$$

where $b^2$ is the given constant value of the product. The last
equation implies the final equation

$$(x^2 + y^2)^2 - 2a^2(x^2 - y^2) = b^4 - a^4$$

which can be easily deduced. Here we have an important special
case when b = a and the curve is ∞-shaped and called the lemniscate.
The lemniscate was discovered in 1694 by the Swiss mathematician
Jakob Bernoulli (1654-1705). Passing to the polar coordinates by
means of formulas (5) we deduce the polar equation of the lemniscate
$\rho^2 = 2a^2\cos 2\varphi$. In case 0 < b < a a Cassinian oval
consists of two separate parts.
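The equivalence of the three descriptions of the lemniscate (the product of distances equal to a², the equation of the fourth degree with b = a, and the polar equation) can be checked numerically. The sketch below uses a hypothetical value of a and hypothetical function names; the points are taken from the polar equation and substituted into the other two.

```python
import math


def distance_product(x, y, a):
    """Product of the distances from (x, y) to F1(-a, 0) and F2(a, 0)."""
    return math.hypot(x + a, y) * math.hypot(x - a, y)


def quartic_lhs(x, y, a):
    """Left-hand side (x^2 + y^2)^2 - 2a^2(x^2 - y^2) of the final equation."""
    return (x * x + y * y) ** 2 - 2 * a * a * (x * x - y * y)


a = 1.5   # hypothetical half-distance between F1 and F2
for phi in (0.0, 0.2, 0.5, 0.7):
    rho = a * math.sqrt(2 * math.cos(2 * phi))      # polar equation, case b = a
    x, y = rho * math.cos(phi), rho * math.sin(phi)
    print(math.isclose(distance_product(x, y, a), a * a),
          math.isclose(quartic_lhs(x, y, a), 0.0, abs_tol=1e-9))
```

For an arbitrary point of the plane the same two functions are linked by the identity quartic_lhs = distance_product² - a⁴, which is the final equation with b² taken as the value of the product at that point.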

It should be noted that when we speak about the degree of an
algebraic curve we always mean an equation in Cartesian coordinates.
For example, the equation of the spiral of Archimedes (see
Sec. 5) has the first degree in the polar coordinates but if we rewrite
it in the Cartesian coordinates we get
$\sqrt{x^2 + y^2} = a\arctan\frac{y}{x} + b$ and therefore we see that
the spiral is a transcendental curve. Spirals of all other types are
also transcendental and, generally, it turns out that all infinite
curves possessing a periodicity property are also transcendental.

The degree of an algebraic curve does not change if we replace a given 
Cartesian coordinate system by another one. 

Indeed, if we perform, for example, a parallel translation of
a coordinate system (see Sec. 2) then equation (13) of a curve turns
into

$$P(x' + a, y' + b) = 0$$

The degree of the polynomial which is obtained after removing
brackets and collecting similar terms cannot become higher than
the original one (obviously, terms of higher degree cannot appear
in this case). One may think that after collecting terms the
degree of the polynomial could decrease in case the terms of higher
degree mutually cancel out. But this cannot occur since otherwise
the degree of the polynomial would increase under the inverse
transition from x', y' to x, y and this, as was shown above, is
impossible. An analogous situation takes place for all other types of
transformation of Cartesian coordinates.

A change of a coordinate system for an immovable curve is
equivalent to the motion of the curve (in the opposite direction)
while the axes of coordinates remain immovable and therefore the
degree of an algebraic curve is invariant (unalterable) when the curve
moves as a whole.

A quantity or, in general, an object which does not change under 
some transformations is called an invariant of these transformations 
(or an invariant with respect to the transformations). For example, 
the area is an invariant of motions and angles are invariants not 
only of motions but of similarity transformations as well. Thus, the 
order of an algebraic curve is also an invariant of motions. This 
very important concept of an invariant was unfamiliar to an owner 
of a garden who was painting the fence. Knowing that he did not 
have enough paint he was doing the job extremely fast so as to 
finish it before the paint was used up. 

8. Singular Cases. There are some equations of the form
$F(x, y) = 0$ which define in the x, y-plane such a set of points
that it would hardly be possible to call it a line or a curve. We
shall illustrate such cases by some examples.

The equation $x^2 + y^2 + 1 = 0$ has no points in the coordinate
plane which satisfy it since the left-hand side is positive for all
x and y. An equation of this kind is said to describe an imaginary
curve since if we use complex numbers we can put, for instance,
x = i, y = 0 etc. But this term does not change the fact that such
a curve does not exist as a real curve in the plane.

Only one point x = 0, y = 0 (the origin of the coordinate system)
satisfies the equation

$$x^2 + y^2 = 0 \qquad (14)$$

Comparing this equation with the equation

$$x^2 + y^2 - R^2 = 0 \qquad (15)$$

(see Sec. 4) we may consider equation (14) as if it defined a circle
with zero radius. Generally, if an object depends on parameters
and if it changes considerably and loses some of its essential
properties for certain values of the parameters we usually say that
the object degenerates for these values. In the above case circle (15)
depends on the parameter R. It degenerates into point (14) for R = 0
and loses its main property, i.e. the property to be a curve.

The equation of an algebraic curve of the second order of the form

$$y^2 - x^2 = 0 \qquad (16)$$

can be rewritten as

$$(y - x)(y + x) = 0$$

But a product is equal to zero if and only if at least one of its
factors equals zero. Therefore, either y - x = 0, i.e. y = x, or
y + x = 0, i.e. y = -x (see Fig. 74). Each of these equations
determines a straight line in the x, y-plane. Consequently,
a point satisfying equation (16) lies either on the first straight line
or on the second one. Thus, the curve defined by equation (16) is
nothing but a pair of straight lines or, as we say, it disintegrates
into a pair of straight lines. Hence, we see that this algebraic curve
of the second order has disintegrated into two straight lines.

But we can by no means regard a hyperbola as a curve which 
disintegrates (Sec. 1.25). In this case we have a curve consisting 
of two components (branches) and each of these components is 
a half of the hyperbola but both branches are described by one and 
the same equation. As for the case of equation (16), each of the 
straight lines obtained above has its own self-dependent equation. 
It appears quite clear that in this way we can artificially unite two
arbitrary curves. If these curves have equations $F_1(x, y) = 0$ and
$F_2(x, y) = 0$ then it is sufficient to take the equation

$$F_1(x, y) \cdot F_2(x, y) = 0$$

We cannot regard as a disintegration of an algebraic curve the
decomposition of the parabola (shown in Fig. 23) into its upper and
lower halves according to the formula

$$y^2 - x = (y - \sqrt{x})(y + \sqrt{x}) = 0 \qquad (17)$$

from which we obtain $y_{1,2} = \pm\sqrt{x}$ instead of $y^2 - x = 0$.
Here the key point is that we have a disintegration when the
polynomial on the left-hand side of equation (13) is factored into
polynomials whereas the factors entering into (17) are not
polynomials (see Sec. 1.17).

§ 3. First-Order and Second-Order Algebraic 
Curves 

9. Curves of the First Order. As is shown in Sec. 7, in order to 
obtain an algebraic curve of the first order we must take a polyno- 
mial of the first degree and equate it to zero. Such a polynomial 
may contain only terms of the first degree and an absolute (con- 
stant) term. Therefore the general form of an equation of a curve 
of the first order is the following: 

$$Ax + By + C = 0 \qquad (18)$$

There may occur two cases. If $B \ne 0$ then dividing the equation
by B and denoting

$$-\frac{A}{B} = k, \qquad -\frac{C}{B} = b \qquad (19)$$

we obtain

$$y = kx + b \qquad (20)$$

We have shown (see Sec. 1.22) that this is the equation of a straight
line (we had a instead of k but this fact is of no importance). This
line is depicted in Fig. 75. In case B = 0 we divide the equation by
A and denote $-\frac{C}{A} = a$. This yields an equation of the form
x = a which is the equation of a straight line parallel to the
y-axis. It should be noted that for such a line the slope
$k = \tan 90^\circ = \pm\infty$, which also follows from relation
(19), but in this case the equation cannot be written in form (20).

Thus, the curves of the first order are straight lines. 

Let us consider several simple problems involving equations of 
straight lines. 

1. Let it be required to construct a straight line with the given
slope $k$ passing through the given point $(x_1, y_1)$. When we say “to
draw” or “to construct” a straight line we mean, of course, in terms
of analytic geometry, to write down its equation. The sought-for
equation has form (20) but $b$ entering into the equation is unknown.
But since the straight line passes through the given point, the
coordinates of the point must satisfy the equation of the line:
$y_1 = kx_1 + b$. Subtracting this from (20) and thus eliminating $b$ we
arrive at the sought-for equation:

$$y - y_1 = k(x - x_1) \tag{21}$$

Making $k$ change and take all the possible values we obtain the
pencil of all possible straight lines passing through the point $(x_1, y_1)$.
We can also put $k = \pm\infty$ and thus obtain the vertical straight
line. But to do this we should divide both sides by $k$ beforehand;
then after the substitution of $k = \pm\infty$ we simply obtain
$0 = x - x_1$, i.e. $x = x_1$. Similar precautions should be taken in other
problems involving infinite values of parameters.

2. Let it be required to draw a straight line through two given
points $(x_1, y_1)$ and $(x_2, y_2)$. The equation of the desired straight
line has form (21) but now $k$ is unknown. But the condition that
the straight line should pass through the second point implies
$y_2 - y_1 = k(x_2 - x_1)$. Dividing (21) by this relation we thus eliminate $k$
and obtain

$$\frac{y - y_1}{y_2 - y_1} = \frac{x - x_1}{x_2 - x_1} \tag{22}$$

In this equation and in equation (21) $x$ and $y$ are current coordinates
of a moving (variable) point of the sought-for straight line.

3. Let it be required to determine the angle formed by two straight
lines with given slopes $k_1$ and $k_2$. The solution is implied by
Fig. 76:

$$\tan\alpha = \tan(\varphi_2 - \varphi_1) = \frac{\tan\varphi_2 - \tan\varphi_1}{1 + \tan\varphi_1 \tan\varphi_2} = \frac{k_2 - k_1}{1 + k_1 k_2} \tag{23}$$

4. The condition for the parallelism of two straight lines is
obvious: $k_1 = k_2$.

5. The condition for the perpendicularity of two straight lines
follows from problem 3: we have $\alpha = \frac{\pi}{2}$ for mutually perpendicular
straight lines but $\tan\frac{\pi}{2} = \pm\infty$ and therefore $1 + k_1 k_2 = 0$ or
$k_2 = -\frac{1}{k_1}$.


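Problems 1 to 5 lend themselves to a quick numerical check. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical (they do not come from the text), and an infinite slope or tangent is represented by `math.inf`.

```python
import math

# Hypothetical helper names, introduced only for this illustration.

def line_through_point(k, x1, y1):
    """Return (k, b) of y = k*x + b passing through (x1, y1), cf. equation (21)."""
    return k, y1 - k * x1

def slope_through_two_points(x1, y1, x2, y2):
    """Slope k eliminated in equation (22); a vertical line gets math.inf."""
    if x2 == x1:
        return math.inf
    return (y2 - y1) / (x2 - x1)

def tan_angle_between(k1, k2):
    """tan(alpha) from formula (23); math.inf when 1 + k1*k2 = 0."""
    denom = 1 + k1 * k2
    if denom == 0:
        return math.inf
    return (k2 - k1) / denom

k, b = line_through_point(2.0, 1.0, 3.0)        # the line y = 2x + 1
k12 = slope_through_two_points(0, 1, 2, 5)      # the same line through two of its points
t_perp = tan_angle_between(1.0, -1.0)           # slopes with k1*k2 = -1
```

Here the slopes 1 and -1 satisfy the perpendicularity condition of problem 5, so the tangent of the angle comes out infinite, in agreement with the remark on infinite values of parameters in problem 1.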

10. Ellipse. An ellipse is the locus of all points in a plane for which 
the sum of their distances from two fixed points in that plane (these 
points are called the focuses of the ellipse) is a constant quantity. 
This definition enables us to use the method of drawing an ellipse 
with the help of a taut thread illustrated in Fig. 77. This procedure 



allows us to visualize the form of an ellipse: the ellipse is a closed
convex curve possessing two symmetry axes (called the principal
axes of the ellipse) and having the centre of symmetry $O$ (called the
centre of the ellipse).

In order to deduce the equation of an ellipse in the simplest form
let us place the coordinate axes in the way shown in Fig. 77 and
denote $F_1F_2 = 2c$ and $r_1 + r_2 = 2a$ where $c$ and $a$ are constants.
Then on the basis of formula (1) we can write, according to the
definition of an ellipse,
$\sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a$, from
which we obtain, in succession,

$$\sqrt{(x + c)^2 + y^2} = 2a - \sqrt{(x - c)^2 + y^2},$$

$$(x + c)^2 + y^2 = 4a^2 - 4a\sqrt{(x - c)^2 + y^2} + (x - c)^2 + y^2,$$

$$a\sqrt{(x - c)^2 + y^2} = a^2 - cx, \qquad a^2[(x - c)^2 + y^2] = a^4 - 2a^2cx + c^2x^2,$$

$$x^2(a^2 - c^2) + a^2y^2 = a^2(a^2 - c^2) \tag{24}$$

It is seen, from the triangle $F_1MF_2$, that $2a > 2c$, that is
$a^2 - c^2 > 0$. Denote for brevity $a^2 - c^2 = b^2$. Then the last of
relations (24) implies the so-called canonical form of the equation of
the ellipse

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \tag{25}$$


This equation again shows that the coordinate axes serve as the
symmetry axes of the ellipse because if a point $(p, q)$ satisfies
equation (25) then the points $(-p, q)$, $(-p, -q)$ and $(p, -q)$ also satisfy
it (see Fig. 78).

Putting $y = 0$ in (25) we get $x = \pm a$ and putting $x = 0$ we obtain
$y = \pm b$. Hence, $a$ and $b$ are, respectively, the lengths of the
semi-major axis and the semi-minor axis of the ellipse (see Fig. 78:
$AO = OC = a$ and $DO = OB = b$). Besides, each summand entering
into the left-hand side of (25) cannot be greater than unity and
therefore $|x| \le a$ and $|y| \le b$. Consequently, the whole ellipse lies
inside the rectangle depicted in Fig. 78. The points $A$, $B$, $C$ and $D$
at which the ellipse intersects its symmetry axes are called the
vertices of the ellipse. An ellipse has four vertices.

The ratio $\varepsilon = \frac{c}{a} = \sqrt{1 - \frac{b^2}{a^2}}$, $0 < \varepsilon < 1$, is called the
eccentricity of the ellipse. This is a dimensionless quantity which does
not change under a similarity transformation of the ellipse when
all its sizes are increased $k$ times since $\frac{kc}{ka} = \frac{c}{a}$.

The eccentricity of an ellipse characterizes its form (its “elongation”)
but not its sizes. Several ellipses are depicted in Fig. 79;
they all have the fixed length $2a$ of their major axes whereas their
eccentricity $\varepsilon$ varies, $c = \varepsilon a$ and $b = a\sqrt{1 - \varepsilon^2}$. This enables us
to see how the eccentricity affects the form of the ellipse: the focuses
draw together and the minor axis tends to the major one in its length
as $\varepsilon$ decreases. Passing to the limit as $\varepsilon \to 0$ we have $\varepsilon = 0$, $c = 0$
and $b = a$, that is we obtain a circle. Consequently, a circle may
be regarded as a singular (limiting) case of an ellipse whose focuses
merge and coincide with the centre of the circle; in this case the
eccentricity is equal to zero. On the contrary, if $\varepsilon$ approaches 1 the
ellipse becomes more and more elongated and it degenerates into
a straight line segment in the limiting process.

An ellipse can be obtained by a uniform contraction of a circle
in a certain direction. Indeed, let us, for instance, consider the
uniform contraction towards the $x$-axis when all the sizes in the
direction of the $y$-axis decrease $k$ times (see Fig. 80). If a point
$M(x, y)$ lies on the curve obtained as a result of the contraction then
the point $N(x, ky)$ must belong to the circle. This implies
$x^2 + (ky)^2 = R^2$ or $\frac{x^2}{R^2} + \frac{y^2}{(R/k)^2} = 1$, i.e. we get an ellipse with the
semi-axes $R$ and $\frac{R}{k}$.

We can easily deduce now the parametric equations of an ellipse
using the above proved property. In fact, the equations $x = R\cos t$
and $y = R\sin t$ ($0 \le t < 2\pi$) define the circle of radius $R$ ($R$ is
given) with centre at the origin of the coordinate system. (Verify
this!) Performing the contraction we obtain $x = R\cos t$ and
$y = \frac{R}{k}\sin t$. If now we introduce the semi-axes $a = R$ and $b = \frac{R}{k}$
we finally derive the parametric equations of the ellipse:

$$x = a\cos t, \qquad y = b\sin t \qquad (0 \le t < 2\pi) \tag{26}$$

We shall show in Sec. XI.6 that a uniform contraction of an
ellipse again yields an ellipse.
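The parametric equations (26), the canonical equation (25) and the definition of the ellipse by the sum of focal distances can be compared numerically. The following Python sketch is a minimal check, not a proof; the semi-axes $a = 5$, $b = 3$ and all function names are assumptions chosen for the illustration.

```python
import math

# Hypothetical helper names; a = 5 and b = 3 are arbitrary sample semi-axes.

def ellipse_point(a, b, t):
    """Point of the ellipse given by the parametric equations (26)."""
    return a * math.cos(t), b * math.sin(t)

def canonical_lhs(a, b, x, y):
    """Left-hand side of the canonical equation (25)."""
    return x**2 / a**2 + y**2 / b**2

def focal_distance_sum(a, b, x, y):
    """r1 + r2 for the focuses (-c, 0) and (c, 0), where c^2 = a^2 - b^2."""
    c = math.sqrt(a**2 - b**2)
    return math.hypot(x + c, y) + math.hypot(x - c, y)

a, b = 5.0, 3.0
samples = [ellipse_point(a, b, 2 * math.pi * i / 12) for i in range(12)]
max_eq_error = max(abs(canonical_lhs(a, b, x, y) - 1) for x, y in samples)
max_sum_error = max(abs(focal_distance_sum(a, b, x, y) - 2 * a) for x, y in samples)
eccentricity = math.sqrt(a**2 - b**2) / a   # the ratio c / a
```

For these sample values $c = 4$, so the eccentricity is $0.8$, and both errors stay at the level of rounding.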

As is known, an orthogonal projection of a plane figure results
in a uniform contraction of the figure and therefore the orthogonal
projection of a circle yields an ellipse (see Fig. 81).

We again obtain an ellipse when considering the section of a right
circular cylinder or of a cone by a plane. Fig. 82 shows such a section
for the case of a cylinder. In order to prove that the section
represents an ellipse we inscribe two spheres into the cylinder in such
a way that they should touch the plane at the points $F_1$ and $F_2$.
As is known, two tangents drawn to a sphere from a common point
are equal. Therefore, we can write, for any point $M$ belonging to the
section, $MF_1 + MF_2 = MN_1 + MN_2 = N_1N_2 = \text{const}$ (see
Fig. 82). This relation implies that our assertion is true. The
corresponding construction for the case of a cone is analogous. The
properties of an ellipse are widely used in drawing.

11. Hyperbola. We have already dealt with a hyperbola (see
Sec. 1.25). But let us now forget about it for a while; later on in
Sec. 13 we shall establish the connection between Secs. 11 and 1.25.
Here we shall give a new definition of a hyperbola: a hyperbola is
the locus of all points in a plane for which the difference between their
distances to two given points (called the focuses of the hyperbola)
is a constant quantity. Choosing the coordinate axes as is shown in
Fig. 83 and denoting $F_1F_2 = 2c$ and $r_1 - r_2 = \pm 2a$ we obtain the
equation of the hyperbola in the form
$\sqrt{(x + c)^2 + y^2} - \sqrt{(x - c)^2 + y^2} = \pm 2a$. Now carrying out some
transformations similar to those in Sec. 10 we arrive at the same
relation (24). (Check it up!) But in this case the triangle $F_1MF_2$
implies $2a < 2c$ and therefore we cannot denote $a^2 - c^2 = b^2$ as it
was done in Sec. 10. (Why is it so?) Therefore we denote
$a^2 - c^2 = -b^2$ which is permissible. Then we deduce from (24) the relation

$$-b^2x^2 + a^2y^2 = -a^2b^2$$

and finally get the canonical form of the equation of the hyperbola:

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \tag{27}$$


This equation shows that a hyperbola has two symmetry axes
(its principal axes) and a centre of symmetry (the centre of the
hyperbola). Putting $y = 0$ we get $x = \pm a$ and putting $x = 0$ we obtain
$y = \pm ib$. Consequently, the $x$-axis intersects the hyperbola at two
points (which are the vertices of the hyperbola). This axis is called
the transverse axis of the hyperbola. The $y$-axis does not intersect
the hyperbola and is called its conjugate axis. The constants $a$ and
$b$ are called the semi-axes of the hyperbola.

Besides, equation (27) shows that $\frac{x^2}{a^2} \ge 1$, that is $x$ must be either
$\le -a$ or $\ge a$ (see Fig. 84).

A hyperbola has two asymptotes. We shall demonstrate this
property restricting ourselves only to the case of the first quadrant
(which is sufficient because of the symmetry properties of a
hyperbola). From (27) it follows that

$$y_{\text{hyp}} = b\sqrt{\frac{x^2}{a^2} - 1} = \frac{b}{a}\sqrt{x^2 - a^2}$$

Later on (in particular, in the end of Sec. IV.22) we shall present
some general rules for investigating expressions of this kind. But
we are not familiar with these rules yet and therefore let us apply
certain artificial transformations which make it possible to “educe”
$x$ from $\sqrt{x^2 - a^2}$:

$$y_{\text{hyp}} = \frac{b}{a}\left[x + \left(\sqrt{x^2 - a^2} - x\right)\right] = \frac{b}{a}x + \frac{b}{a}\left(\sqrt{x^2 - a^2} - x\right)$$

To investigate the behaviour of the second summand
$\frac{b}{a}\left(\sqrt{x^2 - a^2} - x\right)$ as $x$ increases let us multiply and, simultaneously,
divide the summand by $\sqrt{x^2 - a^2} + x$. This yields

$$y_{\text{hyp}} = \frac{b}{a}x + \frac{b}{a} \cdot \frac{\left(\sqrt{x^2 - a^2} - x\right)\left(\sqrt{x^2 - a^2} + x\right)}{\sqrt{x^2 - a^2} + x} = \frac{b}{a}x - \frac{ab}{\sqrt{x^2 - a^2} + x}$$

The fraction entering as the second summand into the right-hand
side unlimitedly approaches zero when the variable point $M$
travels into infinity along the hyperbola. Now taking the straight
line $y_{\text{str}} = \frac{b}{a}x$ we see that the difference $\delta = y_{\text{str}} - y_{\text{hyp}}$
unlimitedly approaches zero as $x \to \infty$ and therefore this straight line
is an asymptote of the hyperbola. Taking into account the symmetry
we finally obtain the equations of the asymptotes:

$$y = \pm\frac{b}{a}x$$
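The behaviour of the difference between the asymptote and the hyperbola in the first quadrant can be observed numerically. The Python sketch below is an illustration only; the semi-axes $a = 3$, $b = 4$ and the function names are assumptions. It compares the direct difference with the closed form $\frac{ab}{\sqrt{x^2 - a^2} + x}$ obtained by multiplying and dividing by the conjugate expression.

```python
import math

# Hypothetical helper names; a = 3 and b = 4 are arbitrary sample semi-axes.

def y_hyperbola(a, b, x):
    """Ordinate of the hyperbola (27) in the first quadrant, x >= a."""
    return (b / a) * math.sqrt(x**2 - a**2)

def gap_to_asymptote(a, b, x):
    """Difference between the asymptote y = (b/a) x and the hyperbola."""
    return (b / a) * x - y_hyperbola(a, b, x)

def gap_closed_form(a, b, x):
    """The same difference written as a*b / (sqrt(x^2 - a^2) + x)."""
    return a * b / (math.sqrt(x**2 - a**2) + x)

a, b = 3.0, 4.0
gaps = [gap_to_asymptote(a, b, x) for x in (5.0, 50.0, 500.0, 5000.0)]
```

The closed form is also the numerically safer expression for large $x$, since the direct difference subtracts two nearly equal numbers.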

It can be shown that the section of the surface of a right circular 
cone (infinitely prolonged in both directions) by a plane which forms 
an angle with the axis of the cone is a hyperbola 
provided the angle is less than the one formed by 
the axis and the slant side of the cone (see Fig. 85). 
Try to prove this assertion reasoning as in the end 
of Sec. 10. 

The properties of a hyperbola can be illustrated by the following
example. Suppose a sound signal is issued from a point $A$. Let this
signal be received at two points $B$ and $C$. Suppose the signal is
received $t$ sec earlier at $B$ than at $C$. Then we can guarantee that
the point $A$ lies on a part of a hyperbola (the nearest to the point $B$)
having its focuses at $B$ and $C$ and the transverse semi-axis $\frac{v_s t}{2}$
where $v_s$ is the speed of sound (let the reader explain this fact).
If two experiments of this kind are carried out the position of the
point $A$ is determined as the point of intersection of the
corresponding hyperbolas.
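The sound-ranging example can be tried on made-up numbers. In the Python sketch below everything numerical is an assumption for illustration: the receivers $B$ and $C$ are placed at $(\mp 1000, 0)$, the source $A$ at $(-400, 700)$, and the speed of sound is taken as 340 m/s. The delay is computed from the geometry, and the sketch then checks that $A$ satisfies the canonical equation of the hyperbola with semi-axis $\frac{v_s t}{2}$.

```python
import math

# All coordinates and the speed of sound are assumed sample values.
B = (-1000.0, 0.0)      # receiver that hears the signal first
C = (1000.0, 0.0)       # receiver that hears the signal later
A = (-400.0, 700.0)     # hypothetical position of the source
v_s = 340.0             # assumed speed of sound, m/s

def transverse_semi_axis(delay_s, speed):
    """Semi-axis a of the hyperbola: the path difference 2a equals speed * delay."""
    return speed * delay_s / 2

delay = (math.dist(A, C) - math.dist(A, B)) / v_s   # t sec, positive since A is nearer to B
a = transverse_semi_axis(delay, v_s)
c = math.dist(B, C) / 2                              # half the distance between the focuses
b_squared = c**2 - a**2                              # from a^2 - c^2 = -b^2
lhs = A[0]**2 / a**2 - A[1]**2 / b_squared           # left-hand side of (27) at the point A
```

The value of `lhs` equals unity up to rounding, and the negative abscissa of $A$ corresponds to the branch nearest to $B$.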

12. Relationship Between Ellipse, Hyperbola and Parabola. There
is a close relationship between an ellipse (see Sec. 10), a hyperbola 
(see Sec. 11) and a parabola (see Sec. 1.23). This can be accounted 
for by the fact that all these curves are algebraic curves of the second 
order and, as it will be shown in Sec. 13, there are no other curves 
of the second order except for some singular cases similar to those 
discussed in Sec. 8. There are many problems involving parameters 
whose solution is one of these curves depending on the values of 
the parameters. In such circumstances the parabola (or its degenera- 
ted forms) usually occupies an intermediate position between the 
ellipse and the hyperbola. 

Let us consider the intersection of a right circular cone (depicted
in Fig. 86) with a plane which turns about an axis drawn
perpendicularly to the axis of the cone (for example, about the axis $pp$).
When the slope is slight (we mean the angle between this plane
and the plane perpendicular to the axis of the cone) we have an
elliptic section. The ellipse elongates as the slope increases and its
eccentricity also increases. When the plane intersects both parts
of the double cone we have a hyperbolic section. Besides, we see
that in the intermediate position when the plane is parallel to the
slant side of the cone we have an intersection line which is infinite
but still consists of one component. There will be no singular cases
here similar to those indicated in Sec. 8 and, besides, a degeneration
of an algebraic curve of the second order cannot yield a curve of a
higher order. Hence the intersection line is a parabola. On the basis
of these properties an ellipse, a hyperbola and a parabola are called
conic sections.

Let us now apply the same point of view to the discussion of the
simplest equation of an algebraic curve of the second order in polar
coordinates. We begin with the polar equation of an ellipse. Let
us put the pole $O$ at the right focus (see Fig. 87). Applying the cosine
law to the triangle $AMO$ where $A$ is the left focus and $M$ is a variable
point of the ellipse we obtain

$$AM^2 = AO^2 + OM^2 - 2\,AO \cdot OM \cos(180^\circ - \varphi);$$

$$(2a - \rho)^2 = (2c)^2 + \rho^2 + 2 \cdot 2c\rho \cos\varphi;$$

$$4a^2 - 4a\rho + \rho^2 = 4(a^2 - b^2) + \rho^2 + 4c\rho \cos\varphi$$

which implies

$$\rho = \frac{b^2}{a + c\cos\varphi} = \frac{\frac{b^2}{a}}{1 + \frac{c}{a}\cos\varphi} \tag{28}$$

Denoting $\frac{b^2}{a} = p$ for brevity ($p$ is called the focal parameter of the
ellipse) we finally deduce the equation

$$\rho = \frac{p}{1 + \varepsilon\cos\varphi} \tag{29}$$

(In case the pole is placed at the left focus we shall have the
expression $1 - \varepsilon\cos\varphi$ in the denominator.) If now we take a hyperbola
and place the pole at its left focus (see Fig. 88) then after similar
transformations (which we leave to the reader) we arrive at the
same formula (28). If we then denote $\frac{b^2}{a} = p$ and $\frac{c}{a} = \varepsilon$ we obtain
equation (29) again. But in this case the dimensionless
quantity $\varepsilon$ (also called the eccentricity of the hyperbola) should be $> 1$.

It is easy to verify that if we take equation (29) with $\varepsilon = 1$ and
pass from the polar coordinates to the Cartesian coordinates
according to formulas (5) we obtain the equation of a parabola. In fact,
$\rho = \frac{p}{1 + \cos\varphi}$ and $\rho + \rho\cos\varphi = p$ which implies
$\sqrt{x^2 + y^2} + x = p$, $\sqrt{x^2 + y^2} = p - x$ and $x^2 + y^2 = p^2 - 2px + x^2$.
Therefore, finally,

$$y^2 = p^2 - 2px, \quad\text{i.e.}\quad x = \frac{p}{2} - \frac{y^2}{2p}$$

The pole of the polar coordinates which was introduced above
is called the focus of the parabola. We see that a parabola, in
contrast to an ellipse or a hyperbola, has only one focus. Now recall
an interesting fact shown in Fig. 79: in case $\varepsilon = 1$ an ellipse turns
into a line segment. Thus, the degeneration may yield different
results in different problems.
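The passage from the polar equation with $\varepsilon = 1$ to a parabola can be checked on sample points. The Python sketch below is a minimal illustration; it assumes that formulas (5) are the usual relations $x = \rho\cos\varphi$, $y = \rho\sin\varphi$, and the value $p = 2$ and the function names are arbitrary choices.

```python
import math

# Hypothetical helper names; p = 2 is an arbitrary sample focal parameter.

def polar_radius(p, eps, phi):
    """rho = p / (1 + eps * cos(phi)), the polar equation of a conic section."""
    return p / (1 + eps * math.cos(phi))

def to_cartesian(rho, phi):
    """Assumed form of formulas (5): x = rho*cos(phi), y = rho*sin(phi)."""
    return rho * math.cos(phi), rho * math.sin(phi)

p = 2.0
angles = [0.0, 0.5, 1.0, 2.0, 2.5, -1.5]   # phi = pi is excluded: there 1 + cos(phi) = 0
residuals = []
for phi in angles:
    x, y = to_cartesian(polar_radius(p, 1.0, phi), phi)
    residuals.append(abs(y**2 - (p**2 - 2 * p * x)))   # y^2 = p^2 - 2px on the parabola
max_residual = max(residuals)
```

Each sampled point satisfies $y^2 = p^2 - 2px$ up to rounding; the excluded direction $\varphi = \pi$ is the one in which the parabola goes off to infinity.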

Equation (29) is applied, in particular, to the problem of motion
of two bodies under their mutual Newtonian attraction which is
known as the problem of two bodies in celestial mechanics. Let us
consider, for example, launching an artificial satellite of the earth
from a point $T$ (lying outside the earth’s atmosphere) in the
horizontal direction (see Fig. 89). If the initial velocity $v_0$ is not
sufficient the satellite will not revolve round the earth. When the “first
cosmic velocity” is achieved the satellite will revolve round the
earth in a circular orbit with centre at the centre of the earth. If
the velocity $v_0$ is then increased the motion will be in an elliptic
orbit and the centre of the earth will be at one of the focuses of the
ellipse.

The further increase of the velocity $v_0$ makes the eccentricity of
the ellipse increase and the second focus of the ellipse moves away
from the first one. After the “second cosmic velocity” (the escape
velocity) has been achieved the trajectory becomes parabolic and the
satellite will not return to the point $T$. Therefore a parabola may be
regarded as an ellipse with one of its focuses removed into infinity.
The further increase of the velocity makes the trajectory turn into
a hyperbola and thus the second focus appears again but this time
“on the other side”. The centre of the earth remains at one of the
focuses of the orbit all the time.




There is one more way of defining conic sections. Rewrite equation
(29) in the form

$$\rho + \rho\varepsilon\cos\varphi = p \quad\text{or}\quad \rho = p - \rho\varepsilon\cos\varphi = \varepsilon\left(\frac{p}{\varepsilon} - \rho\cos\varphi\right)$$

Now we see that the expression in the parentheses obtained above
is just equal to the length of the line segment $MM'$ shown in Fig. 90
(verify this!) where the straight line $ll$ is drawn perpendicularly
to the polar axis at the distance $\frac{p}{\varepsilon}$ from the pole. But $\rho = OM$,
that is we obtain $OM = \varepsilon \cdot MM'$ which implies $\frac{OM}{MM'} = \varepsilon = \text{const}$.

Thus, an ellipse, a hyperbola and a parabola can be defined in a
new way as the locus of all points in a plane for which the ratio of
their distances from a certain point (which is a focus) to their distances
from a certain straight line (the so-called directrix) is a constant
quantity.
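The focus and directrix property can be verified on a few sample points of each type of conic section. The Python sketch below is only an illustration: it assumes the pole at the origin, the polar axis along the $x$-axis and the directrix $x = \frac{p}{\varepsilon}$; the function name is hypothetical.

```python
import math

# Hypothetical helper name; the directrix is assumed to be the line x = p / eps.

def focus_directrix_ratio(p, eps, phi):
    """Return OM / MM' for the point M of rho = p / (1 + eps*cos(phi))."""
    rho = p / (1 + eps * math.cos(phi))     # OM, the distance from the pole (focus)
    x = rho * math.cos(phi)                 # abscissa of M along the polar axis
    distance_to_directrix = p / eps - x     # MM', the expression in the parentheses
    return rho / distance_to_directrix

ratios = {
    eps: [focus_directrix_ratio(2.0, eps, phi) for phi in (0.3, 1.2, 2.0)]
    for eps in (0.5, 1.0, 2.0)              # ellipse, parabola, hyperbola
}
```

For every sampled angle the ratio reproduces the chosen eccentricity, whatever the type of the curve.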

13. General Equation of a Curve of the Second Order. Let us put 
down the general form of an equation of an algebraic curve of the 
second order [as an analogue to equation (18)]: 


$$Ax^2 + 2Bxy + Cy^2 + Dx + Ey + F = 0 \tag{30}$$

(we write $2B$ instead of $B$ because, as will be seen, this simplifies
some formulas which will be obtained further). Now our aim is to
transform the Cartesian coordinates (see Sec. 2) in such a way that
equation (30) should take the simplest form; this will enable us to
find out what curve is determined by the equation.

Since the canonical equations contain no terms with the product of
the coordinates, we first of all try to turn the coordinate axes in
such a manner that this product should be eliminated. According
to formulas (4), after the axes are turned through an angle $\alpha$, the
equation in the new coordinates will have the form

$$A(x'\cos\alpha - y'\sin\alpha)^2 + 2B(x'\cos\alpha - y'\sin\alpha)(x'\sin\alpha + y'\cos\alpha) + C(x'\sin\alpha + y'\cos\alpha)^2 + D(x'\cos\alpha - y'\sin\alpha) + E(x'\sin\alpha + y'\cos\alpha) + F = 0 \tag{31}$$

Removing the parentheses and collecting the terms containing the
product $x'y'$ we see that the coefficient in $x'y'$ is equal to

$$-2A\cos\alpha\sin\alpha + 2B\cos^2\alpha - 2B\sin^2\alpha + 2C\sin\alpha\cos\alpha = 2B\cos 2\alpha + (C - A)\sin 2\alpha$$

Thus, we must have

$$2B\cos 2\alpha + (C - A)\sin 2\alpha = 0, \quad\text{i.e.}\quad \tan 2\alpha = \frac{2B}{A - C} \tag{32}$$

Now we find the angle $\alpha$ from equation (32) and thus determine
through what angle the coordinate axes should be turned.

After the axes are turned through the angle $\alpha$ the equation takes
the form

$$A'x'^2 + C'y'^2 + D'x' + E'y' + F = 0 \tag{33}$$

where $A'$, $C'$, $D'$ and $E'$ are some coefficients which can be found
by collecting similar terms in (31). But, as will be proved in
Sec. XI.11, a rotation of the coordinate axes does not change the
quantity $AC - B^2$ although the coefficients $A$, $B$ and $C$ in terms
of the second degree may vary, that is the expression is an
invariant. There is no term with $x'y'$ in equation (33) and therefore,
since $B' = 0$, we obtain

$$A'C' - B'^2 = A'C' = AC - B^2$$

Consequently, if the expression $AC - B^2$ written for original
equation (30) is positive then the coefficients $A'$ and $C'$ in equation
(33) have the same signs since their product is positive. Such a case
is called elliptic. If $AC - B^2 < 0$ then $A'$ and $C'$ are of opposite
signs (this is a hyperbolic case). Finally, if $AC - B^2 = 0$ then one
of the coefficients ($A'$ or $C'$) is equal to zero (the so-called parabolic
case).

Let us turn to the elliptic case. Completing the square in equation
(33) (compare with Sec. 4) we arrive at an equation of the form

$$A'(x' - a)^2 + C'(y' - b)^2 + F' = 0$$

Let us now translate the axes $x'$ and $y'$ (see Sec. 2) and pass to the
new coordinates $x'' = x' - a$ and $y'' = y' - b$. Then the equation
turns into

$$A'x''^2 + C'y''^2 = -F', \quad\text{that is}\quad \frac{x''^2}{-\frac{F'}{A'}} + \frac{y''^2}{-\frac{F'}{C'}} = 1 \tag{34}$$

For the sake of definiteness let us suppose that both $A'$ and $C'$
are positive. Then if $F' < 0$ we obtain the canonical equation of
an ellipse. Consequently, the original curve is an ellipse but
displaced and turned with respect to the axes $x$ and $y$.

If $F' > 0$ or $F' = 0$ we obtain the singular cases mentioned
in Sec. 8 since there will be, respectively, either an imaginary
curve or a single point.

Similarly, in the hyperbolic case the curve must be a hyperbola
and in the parabolic case the curve must be a parabola with the
exception of the singular cases described in Sec. 8 which, as a rule,
are of no significance.

As an example, let us again consider the graph of the function
$y = \frac{k}{x}$ which expresses inverse variation (see Sec. 1.25). The
equation can be rewritten as $xy - k = 0$. Comparing with equation (30)
we see that in this case $A = C = D = E = 0$, $B = \frac{1}{2}$ and $F = -k$.
Since $AC - B^2 = -\frac{1}{4} < 0$ we have a hyperbolic case. Formula (32)
now implies $\tan 2\alpha = \frac{1}{0} = \pm\infty$ and this means that we can put
$2\alpha = \frac{\pi}{2}$ or $\alpha = \frac{\pi}{4}$. Let us rotate the axes through the angle of 45°.
According to formulas (4) we have

$$x = \frac{\sqrt{2}}{2}(x' - y'), \qquad y = \frac{\sqrt{2}}{2}(x' + y')$$

Substituting the expressions into the original equation we obtain

$$\frac{\sqrt{2}}{2}(x' - y') \cdot \frac{\sqrt{2}}{2}(x' + y') - k = 0, \qquad \frac{x'^2 - y'^2}{2} - k = 0, \qquad \frac{x'^2}{2k} - \frac{y'^2}{2k} = 1$$

Consequently, the curve in question is a hyperbola which has equal
semi-axes: $a = b = \sqrt{2k}$ (see Fig. 91). The fact we have established
here accounts for the usage of the term “hyperbola” in Sec. 1.25.
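The classification by the sign of $AC - B^2$ and the choice of the rotation angle can be sketched in a few lines of Python. This is an illustration under assumptions, not a general routine: the function names are hypothetical, only the second-degree coefficients are transformed, and the angle is taken as $\alpha = \frac{1}{2}\,\mathrm{atan2}(2B, A - C)$, which is one of the solutions of $\tan 2\alpha = \frac{2B}{A - C}$.

```python
import math

# Hypothetical helper names, introduced only for this illustration.

def classify(A, B, C):
    """Type of the case according to the sign of the invariant A*C - B^2."""
    d = A * C - B * B
    if d > 0:
        return "elliptic"
    if d < 0:
        return "hyperbolic"
    return "parabolic"

def rotation_angle(A, B, C):
    """One angle alpha satisfying tan(2*alpha) = 2B / (A - C)."""
    return 0.5 * math.atan2(2 * B, A - C)

def rotate_quadratic(A, B, C, alpha):
    """Coefficients A', B', C' after x = x'cos(a) - y'sin(a), y = x'sin(a) + y'cos(a)."""
    c, s = math.cos(alpha), math.sin(alpha)
    A1 = A * c * c + 2 * B * s * c + C * s * s
    B1 = B * (c * c - s * s) + (C - A) * s * c   # half the coefficient of x'y'
    C1 = A * s * s - 2 * B * s * c + C * c * c
    return A1, B1, C1

# The inverse variation x*y - k = 0: A = C = 0, B = 1/2.
alpha = rotation_angle(0.0, 0.5, 0.0)
A1, B1, C1 = rotate_quadratic(0.0, 0.5, 0.0, alpha)
```

For the inverse variation the sketch gives $\alpha = \frac{\pi}{4}$, $A' = \frac{1}{2}$, $B' = 0$ and $C' = -\frac{1}{2}$, so that $A'C' - B'^2 = -\frac{1}{4}$ as for the original coefficients.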


CHAPTER III 


Limit. Continuity 


§ 1. Infinitesimal and Infinitely Large Variables 

1. Infinitesimal Variables. Infinitesimal variables form a very 
important class of variables which is of great significance in higher 
mathematics. A variable changing in a certain process is called infi- 
nitesimal if in this process it approaches ( tends to) zero unlimitedly. 
For instance, let us consider the process of expansion of a given 
mass of gas. If the gas expands unlimitedly then its density and 
pressure are infinitesimals; this is an example of positive, conti- 
nuous and monotonic (see Sec. 1.5) infinitesimal variables. In the 
process of damped oscillations of a pendulum its angle of deviation 
from the equilibrium position also represents an infinitesimal variable 
as time increases but this variable is an oscillating one and assumes 
both positive and negative values (and the zero value as well). If
we take the sequence $a_1 = -\frac{1}{1}$, $a_2 = -\frac{1}{2}$, $a_3 = -\frac{1}{3}$, ... then
its general term $a_n = -\frac{1}{n}$ is a discrete and negative infinitesimal
variable in the process in which the number $n$ increases:
$n = 1, 2, 3, \ldots$. It should be noted that when a variable is
qualified as an infinitesimal one we must point out a certain process in
which the variable changes since the same variable may not be
an infinitesimal at all in some other process.

As has been stated, an infinitesimal variable $\alpha$ “approaches
zero unlimitedly”. Let us discuss this term in detail. Suppose a
variable $\alpha$ changes in some process and tends to zero. Then there
must exist a moment from which on we shall necessarily have
$|\alpha| < 1$; similarly, there must exist some other (later) moment from
which on $|\alpha| < 0.1$. By the same reasoning, there must exist
the third moment (following the first two moments) from which
on we shall always have $|\alpha| < 0.01$ and so on. This can be
expressed in the following manner: for any $\varepsilon > 0$ there must exist a certain
moment in the development of the process from which on there
will always be $|\alpha| < \varepsilon$. It is not necessary to indicate such a
moment actually in all cases: we should only be sure of the very
existence of the moment. Hence, an infinitesimal variable may not be
small at all at the beginning of the process in which it changes and
the essential fact is that it becomes arbitrarily small (in its absolute
value) when the process goes on sufficiently long.

In addition, let us now state more precisely what the expression 
“a moment” in the development of a process means. If we consider 
a process developing in time then “a moment” simply means a 
certain moment of time. The realization of a process may not be 
connected with the time but may be related to some other variable 
(for instance, in the third of the above examples the process is cha- 
racterized by the change of the number n which assumes the succes- 
sive values 1, 2, 3, . . .); in such cases the existence of “a moment” 
means that a variable which characterizes the process takes on 
a certain value. 

Taking into account the specifications given in the last two
paragraphs we can say that the general term $a_n$ of a sequence is an
infinitesimal variable in the process which is characterized by the
increase of the number $n$ if for any arbitrarily chosen $\varepsilon > 0$ it is
possible to indicate a number $N = N(\varepsilon)$ such that
$|a_n| < \varepsilon$ whenever $n$ becomes larger than $N$ ($n > N$).
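For the sequence $a_n = -\frac{1}{n}$ considered above a suitable number $N(\varepsilon)$ can be written down explicitly. The Python sketch below is a minimal illustration; the choice $N(\varepsilon) = \lfloor 1/\varepsilon \rfloor$ is one admissible choice among many, and the function names are hypothetical.

```python
import math

# Hypothetical helper names; N(eps) = floor(1/eps) is one admissible choice.

def a(n):
    """General term of the sequence a_n = -1/n."""
    return -1 / n

def threshold(eps):
    """A number N(eps) such that |a_n| < eps for every n > N(eps)."""
    return math.floor(1 / eps)

checks = {}
for eps in (1.0, 0.1, 0.01):
    N = threshold(eps)
    # Test a finite stretch of indices following the "moment" N.
    checks[eps] = all(abs(a(n)) < eps for n in range(N + 1, N + 1000))
```

Indeed, if $n > \lfloor 1/\varepsilon \rfloor$ then $n > 1/\varepsilon$ and hence $\frac{1}{n} < \varepsilon$. The loop tests only finitely many indices and therefore illustrates the definition without proving it.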

The notion of an infinitesimal variable can be specified in an 
analogous manner for other types of variables and processes but 
we shall not use them further. 

From the point of view of the definition a constant value, even 
a very small one, is not an infinitesimal variable; only the constant 
value equal to zero is an infinitesimal variable from the formal 
point of view of the above definition. 

Now we must point out that the definition of an infinitesimal 
variable which we shall use here involves the following principal 
difficulty: there are no real variables which may approach zero 
unlimitedly. Indeed, in the examples considered above the gas 
cannot expand unlimitedly and the real pendulum stops oscillating 
after some time has passed. Furthermore, if we take into account 
the molecular structure of a substance we see that we cannot take 
an infinitely small mass of the substance because it is impossible 
to consider the mass of the substance which is smaller than the mass 
of its molecule; similar situations are observed in other examples. 

Thus, our definition applies only to a mathematical model of 
a real process in which the real situation is simplified so that the 
application should become possible. Therefore we speak about a 
pendulum which has oscillations lasting infinitely long and about 
a “continuous” (non-molecular) structure of a substance and so forth. 
We always replace a real process by its mathematical model but
this must be performed in such a way that the main features of the
process we are interested in should not be considerably changed.
But nevertheless we always deal with a model and we must not forget
it. Otherwise some principal mistakes may occur. For example, one
can make an attempt to attribute all the properties of a model to
the reality without sufficient reasons.

There is another possibility of interpreting the usage of the notion
of an infinitesimal in practical applications. Let us discuss it here.
The practical (“physical”) infinity should be distinguished from
the mathematical concept of infinity. Thus, a “practical”
infinitesimal is a variable or even a constant which is sufficiently small in
comparison with “finite” values which are involved in a certain
investigation, i.e. so small that it should be possible to apply to 
it all the properties of “mathematical” infinitesimals without any 
considerable error. At the same time this value must not be too 
small, that is so small that it should be necessary to take into ac- 
count some effects of the microstructure when it is inexpedient, 
or so small that it should not comply with the real possible values. 
For example, suppose we are studying the deformations of an elastic 
body; then certain sizes which are sufficiently small in comparison 
with the size of the body and, at the same time, sufficiently large 
in comparison with the molecular sizes may be regarded as infi- 
nitesimals etc. 

In what follows we shall use the definition given in the beginning 
of this section but from time to time we shall come back to the 
considerations discussed here. 

2. Properties of Infinitesimals. The properties of infinitesimals 
are directly implied by the definition given in Sec. 1. 

1. The sum or the difference of two infinitesimals is also an infi- 
nitesimal variable. Indeed, if each summand approaches zero then 
the sum does the same. Similarly, the sum of three, ten and, in 
general, of an arbitrary finite number of infinitesimals is also an 
infinitesimal. We point out here that there are some circumstances 
when in the development of a process the number of summands ente- 
ring into the sum increases infinitely; then even if each summand 
is an infinitesimal variable the whole sum may not be infinitesimal. 
For instance, 



n times

Here we have the situation described above when n increases; but
the first sum is an infinitesimal variable, the second sum is constant
and the third sum even increases unlimitedly.
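The three types of behaviour can be observed on sample sums. In the Python sketch below the summands $\frac{1}{n^2}$, $\frac{1}{n}$ and $\frac{1}{\sqrt{n}}$, each taken $n$ times, are assumptions chosen for the illustration and are not necessarily the sums displayed in the text; the function name is hypothetical as well.

```python
import math

# The three summands below are assumed examples chosen for this illustration.

def sum_of_n_copies(term, n):
    """Sum of n equal summands term(n); each summand tends to zero as n grows."""
    return sum(term(n) for _ in range(n))

cases = {
    "1/n^2": lambda n: 1 / n**2,             # the sum equals 1/n and tends to zero
    "1/n": lambda n: 1 / n,                  # the sum equals 1 for every n
    "1/sqrt(n)": lambda n: 1 / math.sqrt(n)  # the sum equals sqrt(n) and grows
}
table = {
    name: [sum_of_n_copies(term, n) for n in (10, 100, 1000)]
    for name, term in cases.items()
}
```

In each case every summand is infinitesimal as $n$ increases, but the number of summands grows together with $n$, which is why the property of finite sums no longer applies.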

2. The product of an infinitesimal variable by a bounded variable
(see Sec. 1.5) is again an infinitesimal variable. Let, for example,
the first factor be always in the limits from 0 to 1000 and let the 
second factor assume the successive values 1, 0.1, 0.01, 0.001 and 
so on. Then the values of the product will be, respectively, smaller 
than 1000 X 1 = 1000, 100, 10, 1, 0.1, 0.01, 0.001 and so forth. 

In particular, this property implies that the product of an infi- 
nitesimal by a constant is an infinitesimal variable. The product of 
two infinitesimals is an infinitesimal since an infinitesimal variable 
is, of course, a special case of a bounded variable. Similarly, the 
product of any arbitrary number of infinitesimals is an infinitesimal 
variable. 

We remark that the ratio of two infinitesimals may not be an
infinitesimal. If, for example, $\alpha = \frac{1}{n}$, $\beta = \frac{1}{n^2}$ and $\gamma = \frac{1}{n} + \frac{1}{n^2}$ where
$n$ takes successive values 1, 2, 3, ... then the variables $\alpha$, $\beta$ and
$\gamma$ are infinitesimals. But at the same time the first of the ratios
$\frac{\beta}{\alpha} = \frac{1}{n}$, $\frac{\gamma}{\alpha} = 1 + \frac{1}{n}$ and $\frac{\alpha}{\beta} = n$ is an infinitesimal while the
second approaches 1 and the third even increases unlimitedly. We
shall discuss such ratios in detail in § 3.

3. Infinitely Large Variables. A variable x is called infinitely large
in some process of changing if its absolute value increases unlimitedly
in this process; then we write $|x| \to \infty$. An infinitely large
variable may be positive, and then we write $x \to +\infty$, or negative
(then we write $x \to -\infty$), but it may also change its sign: for
instance, the variable $x_n = (-2)^n$ assumes the values −2, 4, −8,
16, ... as the number n increases and therefore it is infinitely
large, but we cannot say that $x_n \to +\infty$ or $x_n \to -\infty$. The
comprehensive statement of the notion “increases infinitely” is analogous
to the one given in Sec. 1 for the notion of “infinitely approaches
zero” but, of course, here we should consider inequalities of the
form $|x| > N$. This means that from a certain moment on the
variable must satisfy the inequality $|x| > 1$ and from some other
(later) moment on it must satisfy the inequality $|x| > 10$. Further,
there must exist a certain moment from which on we shall have
$|x| > 100$ and so on.
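The "moments" mentioned above can be found explicitly for the sign-changing example with general term (−2)^n. The sketch below is only an illustration; the helper name `first_index_beyond` and the cut-off `max_n` are our own assumptions.

```python
def first_index_beyond(threshold, max_n=1000):
    """Smallest n with |(-2)**n| > threshold.

    Since |(-2)**n| = 2**n grows monotonically, every later term
    exceeds the threshold as well.
    """
    for n in range(1, max_n + 1):
        if abs((-2) ** n) > threshold:
            return n
    raise ValueError("threshold not exceeded within max_n steps")

for N in (1, 10, 100, 10**6):
    print(N, first_index_beyond(N))
```

However large the bound N is chosen, such an index exists, which is exactly what "infinitely large" requires.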

Now let us discuss some simple properties of infinitely large 
variables. A variable which is the inverse of an infinitely large variable 
is an infinitesimal and, conversely, the inverse of an infinitesimal
variable is an infinitely large variable. These properties can be
conditionally expressed as

\[
\frac{1}{\infty} = 0 \qquad \text{and} \qquad \frac{1}{0} = \pm\infty
\]





We shall use this notation but one must understand it correctly.
For example, the first of the properties means that if the variable
x entering into the equality $\frac{1}{x} = \alpha$ increases unlimitedly then in
the same process the variable α approaches zero (or, as in Sec. 1,
if x is a “practical” infinitely large variable then α is “practically”
infinitesimal). All formulas containing the symbol of infinity $\infty$
should be understood in a similar way. For instance, the formula
$\tan \frac{\pi}{2} = \pm\infty$ is a conditional and abbreviated form of expressing
the fact that when the variable φ approaches $\frac{\pi}{2}$ in some process then
the variable x = tan φ increases unlimitedly in its absolute value,
that is x is infinitely large, and so on. This enables us to operate
on the symbol $\infty$ as if it were a usual number in many cases but,
of course, $\infty$ is not a concrete number but only a symbol indicating
infinitely large variables which are different in different
circumstances.

The sum of an infinitely large variable and a bounded variable is 
infinitely large since the first summand “gains over”. The sum of two 
infinitely large variables of a similar sign is also infinitely large. 
In contradistinction to it the sum of two infinitely large variables 
of opposite signs may not be infinitely large since these infinitely 
large variables may “compensate” mutually. These facts are written 
as $\infty + \infty = \infty$; the expression $\infty - \infty$ denotes an indeterminate
form. This shows that it is impossible to operate on the symbol
$\infty$ as on a usual number in all cases; it is not always true that
$\infty - \infty = 0$, since $\infty - \infty$ is an abbreviated and conditional way of
denoting a difference of the form X − Y where X and Y are infinitely
large variables. The behaviour of these infinities may vary in different cases
and therefore it is impossible to have a good judgment on the be- 
haviour of their difference unless an additional investigation has 
been carried out. We shall discuss indeterminate forms of various 
types in detail in our course later on. 
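A few concrete differences X − Y show why no single value can be assigned to this form. The pairs below are illustrative choices of ours, not taken from the text.

```python
import math

# Each entry is the difference X - Y of two infinitely large variables of n.
differences = {
    "(n + 1) - n":           lambda n: (n + 1) - n,                       # stays equal to 1
    "n**2 - n":              lambda n: n**2 - n,                          # increases unlimitedly
    "sqrt(n + 1) - sqrt(n)": lambda n: math.sqrt(n + 1) - math.sqrt(n),   # tends to 0
}

for name, diff in differences.items():
    print(name, [round(diff(n), 6) for n in (10, 1000, 100_000)])
```

In all three cases both X and Y tend to +∞, yet the differences behave in three different ways.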

The product of two infinitely large variables is an infinitely 
large variable. Moreover, the product of an infinitely large variable 
by a variable which is larger than a positive constant in its absolute 
value is an infinitely large variable. At the same time, the ratio 
of two infinitely large quantities is an indeterminate form like the 
ratio of two infinitesimals. 


§ 2. Limits 

4. Definition. It is said that a variable x approaches (tends to) a
finite limit a in some process if a is constant and x approaches a
unlimitedly in this process. Then we write

\[
x \to a \qquad \text{or} \qquad \lim x = a
\]

Thus, the limit of a variable, in case it exists, is a constant value.

According to the above definition infinitesimals are variables 
that approach zero, that is having zero as a limit. But infinitely 
large variables, of course, have no finite limit. 

To say “x approaches a unlimitedly” is to say “the difference
between x and a approaches zero unlimitedly”, that is $x - a = \alpha$
is an infinitesimal. The last equality may be rewritten as

\[
x = a + \alpha
\]

where α is the infinitesimal.

If a variable x approaches its limit a but always remains smaller
than a, that is approaches a from the region of smaller values, then
we conditionally write $x \to a - 0$ or $\lim x = a - 0$ (this is a
conditional way of denoting the limit since if we understand the
expression a − 0 as a real difference then a − 0 = a). If x in its
process of approaching a always remains larger than a then we write
$x \to a + 0$. Finally, x may tend to a in such a way that it
takes on values larger than a and values smaller than a all the time
(such a process resembles damped oscillations). All the cases described
here are depicted in Fig. 92.

Fig. 92

Fig. 93

Now we can sum up our discussion on the types of variables. A variable x
may be of one of the following types in a certain process:

(1) x is bounded and has a limit; in a special case when the limit 
is equal to zero x is an infinitesimal variable. To distinguish between 
these cases a bounded variable is sometimes called finite only if 
it is not an infinitesimal; for instance, it is possible to speak about 
an infinitesimal mass and about a finite mass etc. 


(2) x is bounded but has no limit; as an example we may consider 
the deviation of a pendulum from its equilibrium position in the 
case of undamped oscillations. Variables of this type are called 
oscillating (see Fig. 93). 

In Fig. 93 we see the point a which possesses the following property:
the variable x approaches the point a infinitely many times
in the process of its change and takes on values that are arbitrarily
close to a, but at the same time x does not remain near a
all the time. In this case a is called a limit point of the variable x.
There are the greatest value q and the least value p among these
limit points; q and p are denoted, respectively, as $\overline{\lim}\, x$ and $\underline{\lim}\, x$
and are called the limit superior and the limit inferior of the variable
x. But in this case x does not have a “unique” limit which we discussed
in the beginning of this section. Therefore we should remark
that the everyday notion of a “limit” (in the sense of some kind
of a “border”, “frontier”) differs from the mathematical one.

A bounded variable x always has the limit superior and the limit
inferior, and $\underline{\lim}\, x \le \overline{\lim}\, x$. The “unique” limit, that is the limit
in the sense of our previous definition, exists if and only if
$\underline{\lim}\, x = \overline{\lim}\, x$.

(3) x is unbounded and, besides, infinitely large. In this case we
write $\lim x = \infty$ (or $\pm\infty$) and x is said to have an infinite limit.

(4) x is unbounded but not infinitely large. The deviation of an
oscillating body in the case of resonance may serve as an example
here. Such a variable oscillates and from time to time travels “toward
infinity” further and further, but at the same time it permanently
returns to regions lying near the initial point (see Fig. 94).

Fig. 94
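The four types can be told apart numerically by looking at the least and the greatest values of a variable over a late stretch of the process. This is only a rough sketch: the four sequences and the helper `tail_bounds` are our own illustrative assumptions, and a finite window can merely suggest, not prove, the behaviour.

```python
def tail_bounds(seq, start, stop):
    """Return (min, max) of seq(n) for start <= n < stop.

    For a late window this is a finite stand-in for the limit inferior
    and the limit superior.
    """
    values = [seq(n) for n in range(start, stop)]
    return min(values), max(values)

examples = {
    "(1) 1/n":           lambda n: 1 / n,                 # bounded, has the limit 0
    "(2) (-1)**n":       lambda n: (-1) ** n,             # bounded, no limit
    "(3) n":             lambda n: n,                     # infinitely large
    "(4) n*(1+(-1)**n)": lambda n: n * (1 + (-1) ** n),   # unbounded, keeps returning to 0
}

for name, seq in examples.items():
    print(name, tail_bounds(seq, 1000, 2000))
```

For type (1) the two bounds nearly coincide, for type (2) they stay apart, for type (3) both are large, and for type (4) the lower bound stays at 0 while the upper one grows.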

5. Properties of Limits. 

1. If a variable has a limit then this limit is unique, i.e. there
exist no other limits of the variable (the property is obviously
implied by the definition).

2. The limit of a constant equals the constant (the property is ob- 
vious). 

3. If $x \to a$ and $y \to b$ in one and the same process then
$x + y \to a + b$. This can be written in a different manner as

\[
\lim (x + y) = \lim x + \lim y \tag{1}
\]

and formulated as follows: the limit of a sum is equal to the sum of
the limits. To prove this let us write $x = a + \alpha$ and $y = b + \beta$
where α and β are infinitesimals. Then $x + y = (a + b) + (\alpha + \beta)$.
Hence, the variable x + y is represented as a sum of the constant
a + b and the infinitesimal α + β (see Sec. 2). Therefore,
$(x + y) \to (a + b)$.

The result thus obtained may be interpreted “practically” as the
following example: if 3.002 ≈ 3 and 2.001 ≈ 2 then 5.003 ≈ 5.

4. The limit of a product equals the product of the limits. A more
complete statement is the following: if the factors entering in a product
have limits then the whole product also has a limit which is equal
to the product of the limits of the factors. Indeed, using our previous
notation we have $xy = ab + (a\beta + b\alpha + \alpha\beta) \to ab$, that is

\[
\lim (xy) = \lim x \cdot \lim y \tag{2}
\]

Here, as in property 3, we have taken only two variables but
it is easy to verify that these properties remain true for any arbitrary
finite and constant number of variables. For example,

\[
\lim (xyz) = \lim [(xy) z] = \lim (xy) \cdot \lim z = \lim x \cdot \lim y \cdot \lim z
\]

5. A constant factor may be taken outside the sign of the limit, that 
is lim (Cx) = C lim x where C = const. This property follows from 
properties 2 and 4. 

6. The limit of a ratio is equal to the ratio of the limits, i.e.

\[
\lim \frac{x}{y} = \frac{\lim x}{\lim y} \tag{3}
\]

with the exception of those cases when both the numerator and the
denominator tend to zero, that is when we have the indeterminate
form $\frac{0}{0}$.

To prove this we first suppose that $\lim y = b \ne 0$. Then

\[
\frac{x}{y} = \frac{a + \alpha}{b + \beta}
= \frac{a}{b} + \left( \frac{a + \alpha}{b + \beta} - \frac{a}{b} \right)
= \frac{a}{b} + \frac{\alpha b - \beta a}{b (b + \beta)}
\]

The numerator of the last fraction is infinitesimal whereas the
denominator is $\approx b^2 = \text{const} \ne 0$ and therefore the whole fraction
is infinitesimal while the first fraction is constant. This implies our
assertion.

In case $\lim y = 0$ and $\lim x \ne 0$ we have $\lim \frac{x}{y} = \pm\infty$
(see Sec. 3). Therefore we obtain $\pm\infty$ on either side of formula (3).

By Sec. 3, formulas (1), (2) and (3) hold not only for finite limits
but also for infinite ones with the exception of those cases when
there are indeterminate forms of types $\infty - \infty$, $0 \cdot \infty$ and
$\frac{\infty}{\infty}$ on the right-hand sides. These forms will be discussed in Secs. III.3
and IV.4.
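The rules for the limit of a sum, a product and a ratio can be watched at work on a pair of concrete variables. The variables below, with limits 3 and 2, are illustrative assumptions of ours.

```python
def x(n):
    return 3 + 2 / n          # x -> a = 3

def y(n):
    return 2 - 1 / n**2       # y -> b = 2

a, b = 3, 2
for n in (10, 1000, 100_000):
    print(n,
          x(n) + y(n) - (a + b),   # deviation of the sum from a + b
          x(n) * y(n) - a * b,     # deviation of the product from ab
          x(n) / y(n) - a / b)     # deviation of the ratio from a/b
```

All three deviations shrink together with the infinitesimals x − a and y − b, as the proofs above suggest.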

7. If $x \to a$ and a > 0 then x becomes and remains greater than
zero (x > 0) as the process of its change lasts sufficiently long (i.e. 
from some moment on). This obviously follows from the definition 
of a limit. 

8. It is permissible to pass to the limit in an inequality: if $x \le y$
then $\lim x \le \lim y$ (naturally, if these limits exist). Indeed, let
us denote z = y − x. Then $z \ge 0$ and therefore z cannot approach
a constant which is negative. Hence, $\lim z \ge 0$, $\lim (y - x) \ge 0$
and $\lim y - \lim x \ge 0$.

We remark here that if x < y then after the passage to the limit
we can obtain either $\lim x < \lim y$ or $\lim x = \lim y$ because the
difference between x and y may tend to zero. Thus, we cannot retain
the strict inequality unless an additional investigation has been
carried out.

9. If $x \le y \le z$ and we have $x \to a$ and $z \to a$ in one and the
same process then $y \to a$ (see Fig. 95).

10. If a variable x increases monotonically then it either increases
unlimitedly, i.e. $x \to +\infty$, or is bounded and then has a finite
limit: $x \to a - 0 < \infty$. If, in addition, x has an upper bound A
(i.e. $x \le A$) then $\lim x = a \le A$. A monotonically decreasing
variable behaves similarly. These obvious assertions (see Fig. 96)
are indeed the expression of an essential property of the “completeness”
of the totality of all real numbers. If we used only rational
numbers all the preceding properties of limits would remain true
with the exception of property 10 since rational values may
lead to an irrational result when passing to the limit. The rigorous
justification of property 10 may be found, for example, in [14].

Fig. 95    Fig. 96

Thus, a bounded monotonic variable must have a finite limit; a 
bounded but non-monotonic variable may have no limit (see Sec. 4). 
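The remark about rational numbers can be made concrete. The sketch below uses a well-known iteration (our own choice of example, not taken from the text) whose terms are all rational, decrease monotonically and are bounded below, while the limit, the square root of 2, is irrational.

```python
from fractions import Fraction

def heron_steps(count):
    """Rational sequence x_{k+1} = (x_k + 2/x_k)/2 starting from x_0 = 2."""
    x = Fraction(2)
    out = [x]
    for _ in range(count):
        x = (x + 2 / x) / 2      # exact rational arithmetic, no rounding
        out.append(x)
    return out

seq = heron_steps(5)
for term in seq:
    print(term, float(term))
```

Within the rational numbers alone this bounded monotonic sequence would have no limit, which is why property 10 expresses the completeness of the real numbers.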

To conclude the section we point out that the dimension of a 
variable quantity remains the same when we pass to the limit: 
if $x \to a$ then [x] = [a].

The first attempt to create the theory of limits was made by 
Newton in 1686 but in fact the operation of passing to the limit 
had been used earlier beginning with Greek scientists. The notion 
of a limit which is close to the one used in this book was formulated 
in 1765 by J. D’Alembert (1717-1783), a French mathematician,
philosopher and enlightener of the pre-revolutionary period
in France.

6. Sum of a Numerical Series. The idea of a limit is directly
applicable to the important notion of the sum of a series. As a preliminary
let us introduce the following abbreviated notation:

\[
a_p + a_{p+1} + a_{p+2} + \dots + a_{q-1} + a_q = \sum_{k=p}^{q} a_k \tag{4}
\]

Here $\sum_{k=p}^{q}$ is the summation sign which indicates that we should
substitute k = p, p + 1, ..., q into the expression following
the sign and then sum up the results (Σ is the Greek letter sigma),
$a_k$ is the general term, or general element, or kth term of the sum
(of the series), k is the number of the term (the index of summation),
p and q are, respectively, the lower and the upper limits of summation
showing the range of the index k. For example,

\[
\sum_{k=3}^{8} \frac{1}{k^2} = \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \frac{1}{7^2} + \frac{1}{8^2} \quad (= 0.2774)
\]
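The summation sign translates directly into a loop over the index. The sketch below reads the example above as the sum of 1/k² for k = 3, ..., 8, a reading that reproduces the quoted value 0.2774.

```python
# Sum of 1/k^2 for k = 3, 4, ..., 8 (range excludes its upper end, hence 9).
total = sum(1 / k**2 for k in range(3, 9))
print(round(total, 4))

# The index is a dummy: renaming it changes nothing.
same_total = sum(1 / j**2 for j in range(3, 9))
print(total == same_total)
```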

We should point out immediately that the sum does not depend
on the notation of the index of summation, that is

\[
\sum_{k=p}^{q} a_k = \sum_{i=p}^{q} a_i = \sum_{j=p}^{q} a_j = \dots
\]

Virtually, all these sums are equal to the left-hand side of (4). Thus,
the summation index is a dummy index, that is it does not enter in
the result and may be denoted by any letter.

Now we turn to “infinite sums” or, more precisely, to the notion
of a numerical series. A numerical series is an infinite expression
of the form

\[
a_1 + a_2 + \dots + a_n + \dots = \sum_{k=1}^{\infty} a_k \tag{5}
\]

and the summands $a_1, a_2, a_3, \dots$ are certain numbers called the
terms of the series. To define the sum of series (5) it is necessary
first to compose the so-called “partial sums” of series (5):

\[
S_1 = a_1, \quad S_2 = a_1 + a_2, \quad S_3 = a_1 + a_2 + a_3, \quad \dots, \quad S_n = \sum_{k=1}^{n} a_k
\]

If the nth partial sum tends to a certain finite limit as the number
n increases then series (5) is called convergent and its sum S is
understood as

\[
S = \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} S_n = \lim_{n \to \infty} \sum_{k=1}^{n} a_k
\]

(The inscriptions in small type under the sign lim and, in other
cases, under the sign → indicate the process in which the limits
are considered.) Partial sums of a convergent series which have
large numbers are practically equal to each other and to the whole
sum of the series. If there exists no finite limit of partial sums, series
(5) is called divergent. In particular, if partial sums approach infinity,
series (5) is said to diverge to infinity; in this case we write

\[
\sum_{k=1}^{\infty} a_k = \infty \quad (\text{or } -\infty)
\]

A divergent series has no finite sum.

The product of an infinite number of factors is determined in a 
similar way. The same manner of reasoning applies to any infinite 
process: first a finite process is performed and then the passage 
to the limit is carried out. 

Let us consider, for example, the series

\[
1 + \frac{1}{3} + \frac{1}{3^2} + \dots + \frac{1}{3^{n-1}} + \dots \tag{6}
\]

Using the formula for the sum of a geometrical progression we obtain

\[
\lim_{n \to \infty} \left( 1 + \frac{1}{3} + \frac{1}{3^2} + \dots + \frac{1}{3^{n-1}} \right)
= \lim_{n \to \infty} \frac{1 - \frac{1}{3^n}}{1 - \frac{1}{3}}
= \frac{1}{1 - \frac{1}{3}} = \frac{3}{2}
\]

Thus, series (6) converges and its sum equals 1.5. If we calculate
the partial sum of the first ten terms we obtain approximately
1.499975.
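The partial sums of this geometric series with ratio 1/3 are easy to tabulate. The sketch below uses the standard-library function `itertools.accumulate`; the variable names are ours.

```python
from itertools import accumulate

terms = [(1 / 3) ** k for k in range(20)]    # 1, 1/3, 1/9, ...
partial_sums = list(accumulate(terms))       # S_1, S_2, ..., S_20

print(partial_sums[9])     # S_10, the sum of the first ten terms
print(partial_sums[-1])    # S_20
```

The tenth partial sum already agrees with the sum 1.5 to about four decimal places, and later partial sums are practically indistinguishable from it.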

In like manner the series 

\[
a + aq + aq^2 + \dots + aq^n + \dots \tag{7}
\]

converges for $|q| < 1$ and its sum (the sum of an infinitely decreasing
geometrical progression) is equal to $a (1 - q)^{-1}$.

The nth partial sum $S_n$ of the series

\[
1 + 1 + 1 + \dots + 1 + \dots \tag{8}
\]

equals n and therefore it tends to infinity. Hence, series (8) diverges
to infinity. Similarly, $-1 - 1 - 1 - \dots - 1 - \dots = -\infty$.
Partial sums of the series

\[
1 - 1 + 1 - \dots + (-1)^{n+1} + \dots \tag{9}
\]

are equal, in succession, to $S_1 = 1$, $S_2 = 0$, $S_3 = 1$, $S_4 = 0$, ...
and they have neither a finite nor an infinite limit but remain bounded
and oscillate without damping (compare with the end of Sec. 4,
Case 2). Thus, series (9) diverges in an “oscillating” manner.
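Both kinds of divergence are visible in the first few partial sums. A minimal sketch, with the number of terms (eight) chosen arbitrarily:

```python
from itertools import accumulate

ones = [1] * 8                                          # terms of series (8)
alternating = [(-1) ** (n + 1) for n in range(1, 9)]    # terms of series (9)

sums_8 = list(accumulate(ones))           # S_n = n
sums_9 = list(accumulate(alternating))    # alternates between 1 and 0

print(sums_8)
print(sums_9)
```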

If we drop or add a term in series (5) this will not affect the very
fact of its convergence or divergence, that is if series (5) converged
before, it will converge now though its sum may change, and if series
(5) did not converge it will diverge after the operation. Indeed, if
we take the series $a_2 + a_3 + \dots + a_n + \dots$ and consider it
together with (5) then its partial sums will differ from the corresponding
partial sums of (5) by the constant number $a_1$ and therefore
if one of the sums approaches a limit the other does the same. Repeating
such droppings or additions of a finite number of terms of series
(5) we come to the conclusion that an arbitrary change of a finite
number of terms of series (5) does not affect its convergence or
divergence.

If series (5) converges then the series $R_n = a_{n+1} + a_{n+2} + a_{n+3} + \dots$
also converges (why is it so?). Its sum is called the
remainder (remainder term, remainder after n terms, or “tail”) of
series (5). It is clear that

\[
S = \sum_{k=1}^{\infty} a_k = \sum_{k=1}^{n} a_k + \sum_{k=n+1}^{\infty} a_k = S_n + R_n
\]

It follows that the remainder of a convergent series tends to zero
as the number n increases since it is the difference between the sum
of the series (the limit of the partial sums) and the nth partial sum.

We shall now establish the necessary condition (test) for the convergence
of series (5). Since $S_{n-1} = a_1 + a_2 + \dots + a_{n-1}$ and
$S_n = a_1 + a_2 + \dots + a_{n-1} + a_n$ we have $a_n = S_n - S_{n-1}$ and
therefore

\[
a_n \xrightarrow[n \to \infty]{} S - S = 0 \tag{10}
\]

if series (5) converges. Thus, the general term $a_n$ of series (5) tends
to zero as the number n increases. This condition is not at all sufficient
for the convergence; for example, the condition is fulfilled for the
series


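\[
1 + \frac{1}{\sqrt 2} + \frac{1}{\sqrt 3} + \dots + \frac{1}{\sqrt n} + \dots
\]

but the series diverges to infinity. The divergence is implied by the
fact that, for n > 1,

\[
S_n = 1 + \frac{1}{\sqrt 2} + \frac{1}{\sqrt 3} + \dots + \frac{1}{\sqrt n}
> \underbrace{\frac{1}{\sqrt n} + \frac{1}{\sqrt n} + \dots + \frac{1}{\sqrt n}}_{n\ \text{times}}
= n \cdot \frac{1}{\sqrt n} = \sqrt n \xrightarrow[n \to \infty]{} \infty
\]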
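The gap between the necessary condition and actual convergence can be seen numerically. The sketch below takes the general term to be 1/√n, so that the terms tend to zero while the partial sums stay above √n; the function name `partial_sum` is ours.

```python
import math

def partial_sum(n):
    """S_n = 1 + 1/sqrt(2) + ... + 1/sqrt(n)."""
    return sum(1 / math.sqrt(k) for k in range(1, n + 1))

for n in (10, 1000, 100_000):
    # general term, partial sum, and the lower bound sqrt(n)
    print(n, 1 / math.sqrt(n), partial_sum(n), math.sqrt(n))
```

The second column shrinks toward zero, yet the third column keeps growing without bound.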


We shall systematically study series of different types in Chapter
XVII where, in particular, some rigorous sufficient conditions
(tests) for their convergence will be formulated and proved. Yet
we are going to make use of some series before that as the question
of their convergence may be settled in some practical sense, although
not rigorously enough, in the following way. We compute the terms
one after another and when we see that soon enough, from some
moment on, their values become less than the required degree of
accuracy of calculations and that there is no reason to expect that
the addition of the following terms may noticeably change the sum
we simply drop all the subsequent terms. This means that we replace
the series by a finite number of its terms (i.e. by a partial sum).
In such a case we may say that the series is “practically convergent”.
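The practical rule described above can be sketched as a short routine. Everything here is an illustrative assumption: the name `practical_sum`, the tolerance, and the test series with terms 1/k!, whose sum is known to be the number e.

```python
import math

def practical_sum(term, tol=1e-12, max_terms=10_000):
    """Add term(0), term(1), ... and stop at the first term smaller than tol."""
    total = 0.0
    for k in range(max_terms):
        t = term(k)
        if abs(t) < tol:
            return total, k          # k terms were actually added
        total += t
    raise RuntimeError("terms did not become small enough")

approx, used = practical_sum(lambda k: 1 / math.factorial(k))
print(approx, used, math.e)
```

The rule is only a heuristic: the terms 1/√n of the divergent series considered earlier also fall below any tolerance eventually, so small terms alone give no guarantee, which is why the text calls this approach not rigorous enough.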

§ 3. Comparison of Variables 

7. Comparison of Infinitesimals. Comparison of two infinitesimals
with each other is carried out by investigating their ratio. Let
$\alpha \to 0$ and $\beta \to 0$ be two infinitesimals varying in one and the
same process. Then we can have the following cases.

(1) If $\frac{\beta}{\alpha} \to 0$ then β is said to tend to zero faster than α, or
β is said to be an infinitesimal of higher order than α and α an infinitesimal
of lower order than β. This fact is written in the form $|\beta| \ll |\alpha|$
or $|\alpha| \gg |\beta|$; the symbolic equality $\beta = o(\alpha)$ is also used for
this purpose. Hence, in this case β is not only an infinitesimal but
is also an infinitesimal part of the other infinitesimal α. For example,
let ω be the volume of an infinitesimal cube and σ the volume of
a right prism with the same base and with the constant altitude a.
Then $|\omega| \ll |\sigma|$ since if h denotes the edge of the cube then

\[
\omega = h^3, \qquad \sigma = a h^2 \qquad \text{and} \qquad \frac{\omega}{\sigma} = \frac{h^3}{a h^2} = \frac{h}{a} \to 0 \quad \text{as } h \to 0.
\]

(2) If $\frac{\beta}{\alpha} \to \infty$ then $\frac{\alpha}{\beta} \to 0$, i.e. $|\alpha| \ll |\beta|$.

(3) If the ratio $\frac{\beta}{\alpha}$ approaches a finite nonzero limit then α and β
are called infinitesimals of the same order; in this case neither of the
variables α and β can become much smaller than the other. In
particular, in case this limit is equal to unity the variables α and
β are called equivalent infinitesimals; then we write α ~ β. For
example, the infinitesimals x and $x + x^2$ are equivalent as $x \to 0$
whereas the infinitesimal variables 2x and $x + x^2$ are of the same
order in this process but not equivalent since $\frac{2x}{x + x^2} \to 2$.
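The ratios that decide the comparison can be evaluated for a few small values. In the sketch below the altitude a = 2 is an arbitrary illustrative value, and the function names are ours.

```python
def cube_to_prism(h, a=2.0):
    """Ratio omega/sigma = h**3 / (a*h**2) for cube edge h and prism altitude a."""
    return h**3 / (a * h**2)

def same_order_ratios(x):
    """Return ((x + x**2)/x, 2x/(x + x**2))."""
    return (x + x**2) / x, 2 * x / (x + x**2)

for t in (1e-1, 1e-3, 1e-5):
    print(t, cube_to_prism(t), same_order_ratios(t))
```

The first ratio tends to 0 (higher order), the second to 1 (equivalent infinitesimals) and the third to 2 (same order, not equivalent).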

It is possible to verify the following properties.

(1) If α and β are of the same order and $|\gamma| \ll |\alpha|$ then $|\gamma| \ll |\beta|$.

(2) If α and β are of the same order and β and γ are also of the
same order then the same is true for α and γ.

(3) If α and β are of the same order and have the same sign then
α + β is of the same order as α and β; in case α and β have opposite
signs α + β may happen to be of higher order and so forth.





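For example, let us verify the first property:

\[
\lim \frac{\gamma}{\beta} = \lim \left( \frac{\gamma}{\alpha} \cdot \frac{\alpha}{\beta} \right) = \lim \frac{\gamma}{\alpha} \cdot \lim \frac{\alpha}{\beta} = 0 \cdot \lim \frac{\alpha}{\beta} = 0
\]

which implies what was required. (Verify the remaining properties;
by the way, they are quite evident.)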

(4) If $\frac{\beta}{\alpha} \to k \ne 0$ and we denote $\beta - k\alpha = \gamma$, i.e. $\beta = k\alpha + \gamma$,
then $|\gamma| \ll |\alpha|$; in other words, infinitesimals of the same order
are proportional to each other within the accuracy to a term of higher
order. This follows immediately from

\[
\frac{\gamma}{\alpha} = \frac{\beta - k\alpha}{\alpha} = \frac{\beta}{\alpha} - k \to 0
\]

Such a case of “almost proportional” infinitesimals is sometimes
denoted by a special symbol written between α and β.

8. Properties of Equivalent Infinitesimals. The following simple
properties are true:

(1) if α ~ β then β ~ α;

(2) if α ~ β and β ~ γ then α ~ γ;

(3) if α ~ β then α = β + γ where $|\gamma| \ll |\alpha|$ (and $|\gamma| \ll |\beta|$);
in other words, the difference between equivalent infinitesimals
is an infinitesimal variable of higher order. Conversely, if
α = β + γ where $|\gamma| \ll |\beta|$ then α ~ β which means that the
addition of a variable of higher order to an infinitesimal results
in a variable equivalent to this infinitesimal;

(4) if $\alpha \sim \alpha_1$ and $\beta \sim \beta_1$ then $\lim \frac{x\alpha}{y\beta} = \lim \frac{x\alpha_1}{y\beta_1}$ where x and y
are arbitrary variables or numbers; this means that when calculating
limits it is allowed to replace infinitesimals occurring in the numerator
and in the denominator by equivalent variables.

All these properties can be verified in a similar way. For example,
let us justify the fourth property:

\[
\frac{x\alpha}{y\beta} = \frac{x\alpha_1}{y\beta_1} \cdot \frac{\alpha}{\alpha_1} \cdot \frac{\beta_1}{\beta}
\]

which implies

\[
\lim \frac{x\alpha}{y\beta} = \lim \frac{x\alpha_1}{y\beta_1} \cdot \lim \frac{\alpha}{\alpha_1} \cdot \lim \frac{\beta_1}{\beta} = \lim \frac{x\alpha_1}{y\beta_1} \cdot 1 \cdot 1
\]

and so the fourth property is true.
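The fourth property can be tried on a simple ratio. In the sketch below the ratio is our own illustrative choice; both the numerator and the denominator are replaced by equivalent infinitesimals obtained by dropping terms of higher order.

```python
def original(x):
    return (x + x**2) / (2 * x + x**3)

def replaced(x):
    # x + x**2 ~ x and 2x + x**3 ~ 2x as x -> 0 (higher-order terms dropped)
    return x / (2 * x)

for x in (1e-1, 1e-3, 1e-6):
    print(x, original(x), replaced(x))
```

The simplified ratio equals 1/2 for every x, and the original ratio approaches the same value as x tends to zero.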

9. Important Examples. 

1. The length of an infinitesimal arc is equivalent to that of its
chord, that is $\frac{\overset{\frown}{MN}}{MN} \to 1$ as $N \to M$ (see Fig. 97). The explanation of
the property lies in the fact that a small arc is so short that it has
no “room” to “crook” noticeably, i.e. to change its direction. Therefore,
if we watch these elements “through a microscope” so that
they should be magnified up to finite sizes the chord will be practically
indistinguishable from the arc. In more rigorous investigations
of this obvious property this fact is sometimes regarded as an axiom
on which the definition of the length of an arc is based. On the other
hand, this property is sometimes deduced from other analogous
axioms.

2. Applying the above result to an infinitesimal arc of a circle
(see Fig. 98) we derive

\[
\frac{MN}{\overset{\frown}{MN}} = \frac{2PN}{2\overset{\frown}{QN}} = \frac{2R \sin x}{2Rx} = \frac{\sin x}{x} \xrightarrow[x \to 0]{} 1
\]

It was meant that x > 0 but the expression $\frac{\sin x}{x}$ does not change
when x changes its sign and therefore the sign of x does not matter
here. Thus

\[
\lim_{x \to 0} \frac{\sin x}{x} = 1 \tag{11}
\]

Incidentally, we see that sin x < x for x > 0 (since $MN < \overset{\frown}{MN}$).
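The limit of sin x / x and the inequality sin x < x are easy to check for a few values of x (in radians). A minimal sketch, with the helper name `ratio` being ours:

```python
import math

def ratio(x):
    """sin(x)/x for nonzero x given in radians."""
    return math.sin(x) / x

for x in (1.0, 0.1, 0.001, -0.001):
    print(x, ratio(x))

# The inequality sin x < x, checked for several positive x.
print(all(math.sin(x) < x for x in (1e-3, 0.5, 1.0, 3.0)))
```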

3. Now let us consider Fig. 99 which repeats Fig. 41 with some
additional lines. We see that $\tan \beta = \frac{\ln (1 + h)}{h}$. Now if $h \to 0$ then
$\beta \to \alpha = 45^\circ$ and $\tan \beta \to \tan \alpha = 1$, that is

\[
\lim_{h \to 0} \frac{\ln (1 + h)}{h} = 1 \tag{12}
\]

In Fig. 99 it is assumed that h > 0 but the same is true for h < 0
and $h \to 0$.

Here is an important corollary of formula (12). Since
$(1 + h)^{\frac{1}{h}} = e^{\frac{\ln (1 + h)}{h}}$, we have, as $h \to 0$,

\[
\frac{\ln (1 + h)}{h} \to 1 \qquad \text{and} \qquad (1 + h)^{\frac{1}{h}} \to e^1 = e
\]

Thus,

\[
\lim_{h \to 0} (1 + h)^{\frac{1}{h}} = e \tag{13}
\]

The last limit is sometimes taken as the definition of the number e.
Many other limits can be evaluated by means of the results represented
above. Here we also mention that we shall offer a more standard
and simple method of evaluating limits in § IV.4.
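The two limits involving ln(1 + h) can be observed numerically for positive and negative h. The sketch uses `math.log1p`, which computes ln(1 + h) accurately for small h; the function names are ours.

```python
import math

def log_ratio(h):
    """ln(1 + h) / h for nonzero h > -1."""
    return math.log1p(h) / h

def power(h):
    """(1 + h) ** (1 / h) for nonzero h > -1."""
    return (1 + h) ** (1 / h)

for h in (0.1, 0.001, 1e-6, -0.001):
    print(h, log_ratio(h), power(h))
print(math.e)
```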

10. Orders of Smallness. Let α and β be two infinitesimals changing
in one and the same process. If β is of the order of $\alpha^k$ then β
is said to be an infinitesimal of the kth order relative to α. Here
the speed at which α approaches zero serves as some standard with
which the speed of β (as β tends to zero) is compared.

Examples. Let $x \to 0$, i.e. let x be an infinitesimal. We shall
regard it as a standard. Then if $y = 2x^2$ we see that y is an infinitesimal
of the second order (relative to x) since y and $x^2$ are infinitesimals
of the same order; if $z = 4x^3 + x^4$ then z is an infinitesimal
of the third order since z and $x^3$ are of the same order ($\frac{z}{x^3} \to 4$).

In general, the sum (or the difference) of infinitesimals of different
orders is characterized by the lowest order of the infinitesimals. Namely,
the infinitesimal which has the lowest order of smallness is the
principal term in such a sum. In other words, all the remaining terms
are negligibly small relative to the principal one and the sum is
almost completely exhausted by the principal term when the process
goes on sufficiently long. Furthermore, if $u = \sqrt{x} - x^2$ then u
is of order $\frac{1}{2}$ and hence u is an infinitesimal of lower order
than x, i.e. $\frac{u}{x} \to \infty$ and $|u| \gg |x|$. Generally, if the order of
an infinitesimal is lower than unity the infinitesimal is of lower
order relative to the standard. Finally, let us take $v = 1 - \cos x$;
here v is of the second order since

\[
\lim_{x \to 0} \frac{v}{x^k}
= \lim_{x \to 0} \frac{1 - \cos x}{x^k}
= \lim_{x \to 0} \frac{2 \sin^2 \frac{x}{2}}{x^k}
= \lim_{x \to 0} \frac{2 \sin \frac{x}{2} \sin \frac{x}{2}}{x^k}
= \lim_{x \to 0} \frac{2 \cdot \frac{x}{2} \cdot \frac{x}{2}}{x^k}
\]

(in the passage to the last term we have used the fourth property
from Sec. 8) and therefore to receive a finite nonzero limit
we should put k = 2.

If the standard is changed the order of an infinitesimal may also
change and therefore it is absolutely necessary to indicate the standard
variable. For instance, the variable $y = x^6$ is an infinitesimal
of the sixth order relative to x as $x \to 0$ but it is only of the second
order relative to $x^3$.
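The order of an infinitesimal relative to the standard x can be estimated from two sample points: if a variable behaves like C·x^k near zero, the logarithm of the ratio of its values reveals k. This is a rough sketch under our own assumptions; the helper `estimated_order`, the sample points, and the third-order example 4x³ + x⁴ are illustrative.

```python
import math

def estimated_order(f, x1=1e-2, x2=1e-3):
    """Estimate k such that f(x) behaves like C * x**k near 0, from two points."""
    return math.log(abs(f(x1) / f(x2))) / math.log(x1 / x2)

samples = {
    "2x^2":          lambda x: 2 * x**2,
    "4x^3 + x^4":    lambda x: 4 * x**3 + x**4,
    "sqrt(x) - x^2": lambda x: math.sqrt(x) - x**2,
    "1 - cos x":     lambda x: 1 - math.cos(x),
}

for name, f in samples.items():
    print(name, round(estimated_order(f), 3))
```

The estimates come out close to 2, 3, 1/2 and 2, the orders determined by the principal terms.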

11. Comparison of Infinitely Large Variables. The comparison
of infinitely large variables is carried out in a way similar to that
of comparison of infinitesimals. But there exists a certain difference
in the terminology: thus, if $\frac{x}{y} \to 0$ where x and y are infinitely
large variables then we say that x is of lower order relative to y
and y is of higher order relative to x but the notation $|x| \ll |y|$
or $x = o(y)$ remains (these forms of writing are used in all cases
when $\frac{x}{y} \to 0$ even if variables x and y are neither infinitesimal
nor infinitely large; we also remark that the notation $x = O(y)$
indicates that the ratio $\frac{x}{y}$ is bounded). All the assertions of Sections
7, 8 and 10 are transferred with some little changes to infinitely
large variables.

To conclude the section we point out an obvious property: if
$\lim x = 0$, $\lim y = \text{const} \ne 0$ and $\lim |z| = \infty$ then $|x| \ll |y|$
and $|y| \ll |z|$.


§ 4. Continuous and Discontinuous Functions

12. Definition of a Continuous Function. The definition of a 
continuous function was given in Sec. 1.16. Now we are going to 
discuss it in detail. 

Suppose a function y = f(x) is given. Let its argument first take
on the value $x_0$ and then receive an increment Δx, that is let x assume
a new value $x = x_0 + \Delta x$ (see Sec. 1.22 on this notation). Then the
function will also receive an increment

\[
\Delta y = y - y_0 = f(x) - f(x_0) = f(x_0 + \Delta x) - f(x_0) \tag{14}
\]

The function f is called continuous at the point $x_0$ (i.e. for the value
of the argument equal to $x_0$) if $\Delta y \to 0$ in a process in which $\Delta x \to 0$
or, in other words, if the increment of the function is an infinitesimal
when the increment of the argument is infinitesimal. Otherwise $x_0$
is called the point of discontinuity of the function f.

Since $f(x) = f(x_0) + \Delta y$ [see formula (14)] the condition $\Delta y \to 0$
is equivalent to the following condition: $f(x) \to f(x_0)$. Further,
writing $x \to x_0$ instead of $\Delta x \to 0$ we arrive at the following equivalent
representations of the definition of the continuity:

\[
f(x) \xrightarrow[x \to x_0]{} f(x_0) \qquad \text{or} \qquad \lim_{x \to x_0} f(x) = f(x_0)
\]

and, finally,

\[
\lim_{x \to x_0} f(x) = f\left( \lim_{x \to x_0} x \right) \tag{15}
\]

This means that the limit of a continuous function at a point equals
the value which the function assumes when the argument takes on the
value of its own limit.*

It should be emphasized that the value of a function at a point
of its continuity cannot be infinite.

A function which is continuous at each point of an interval is said
to be continuous over the interval.
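The definition through increments can be tried on two simple functions. Both functions below are illustrative assumptions of ours: a square, which is continuous at the chosen point, and a step function, which is not.

```python
def increment(f, x0, dx):
    """Increment of f at x0 produced by the increment dx of the argument."""
    return f(x0 + dx) - f(x0)

def square(x):
    return x * x

def step(x):
    return 0.0 if x < 0 else 1.0      # jumps at x = 0

for dx in (1e-1, 1e-3, 1e-6):
    # continuous case at x0 = 2, discontinuous case at x0 = 0 approached from the left
    print(dx, increment(square, 2.0, dx), increment(step, 0.0, -dx))
```

The increment of the square shrinks together with dx, while the increment of the step function stays equal to −1 however small dx becomes.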

13. Points of Discontinuity. If $x_0$ is a point of discontinuity of
a function f then the value $f(x_0)$ of the function is often undetermined
but this fact usually does not play any important role. In these
circumstances the limits of the values of the function f(x) as $x \to x_0 - 0$
and $x \to x_0 + 0$ are essentially important (see Sec. 4). These two
limits are denoted conditionally as $f(x_0 - 0)$ and $f(x_0 + 0)$, respectively
(see Fig. 100).**

It may occur that these limits are finite and $f(x_0 - 0) = f(x_0 + 0)$
though the value $f(x_0)$ may not be defined or it is defined but does
not coincide with $f(x_0 \pm 0)$. Such a discontinuity is called removable
since if we put $f(x_0) = f(x_0 \pm 0)$ (this value may be called the
“true” value of the function f(x) at the point $x = x_0$) then there will
no longer exist any discontinuity of f(x). We shall give a simple
example: let the function f(x) be defined by the formula

\[
f(x) = \frac{\sin x}{x} \tag{16}
\]


* Recalling the specification pointed out in Sec. 1 we can now formulate the
definition of the continuity at the point $x_0$ as follows: for any given ε > 0 there
must exist δ > 0 such that $|x - x_0| < \delta$ should imply $|f(x) - f(x_0)| < \varepsilon$.
This definition given by A. L. Cauchy (1789-1857), a prominent French mathematician,
is fundamental in books meant for mathematicians.

Defining the continuity of y = f(x) at $x = x_0$ as the condition $\Delta y \to 0$
for $\Delta x \to 0$ we mean that $\Delta y \to 0$ not only for some certain process in which
$\Delta x \to 0$ or for several such processes but for all possible processes of this kind.
Writing (15) we also mean that it holds for all processes in which $x \to x_0$.
Cauchy’s definition takes these facts into account automatically since for every
process in which $x \to x_0$ there exists a moment from which on $|x - x_0| < \delta$.

** $f(x_0 \mp 0)$ are called, respectively, the limit of f(x) on the left of the
point $x = x_0$ (the left-hand limit) and the limit of f(x) on the right of the point
$x = x_0$ (the right-hand limit). (Tr.)
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The function is continuous for all x 0 but not defined at x — 0 
since x = 0 cannot be substituted into formula (16) because this 

yields the indeterminate form . But if in addition to formula (16) 

we put / ( 0 ) = 1 then, by formula ( 11 ), the new function f ( x ) thus 
obtained will be defined and continuous for all x without exceptions. 
Thus, we had the removable discontinuity at the point x = 0. From 




Fig. 101 


the geometrical point of view this means that the curve pp (see 
Fig. 10 1 ) “lacked” one point, i.e. the point M. After the point has 
been added to the curve it becomes continuous. 
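The removal of the discontinuity can be illustrated numerically. The sketch below assumes that formula (16) is $f(x) = \frac{\sin x}{x}$; the names `f` and `f_patched` are introduced here only for illustration.

```python
import math

def f(x):
    """f(x) = sin(x)/x; not defined at x = 0 (division by zero)."""
    return math.sin(x) / x

def f_patched(x):
    """The same function with the "true" value f(0) = 1 added."""
    return 1.0 if x == 0 else math.sin(x) / x

# The values on the left and on the right of 0 approach the same number.
for h in (0.1, 0.01, 0.001):
    print(h, f(-h), f(h))

print(f_patched(0.0))
```

The printed values on both sides of zero come closer and closer to 1, which is why putting $f(0) = 1$ leaves no break in the graph.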

If the values $f(x_0 - 0)$ and $f(x_0 + 0)$ are finite but
$f(x_0 - 0) \ne f(x_0 + 0)$ then the function $f$ is said to have a point of
discontinuity of the first kind or, which is the same, a finite jump [the
term jump is applied to the value $\Delta = f(x_0 + 0) - f(x_0 - 0)$]
or a jump discontinuity (see Fig. 100). In case at least one of the
values $f(x_0 - 0)$ or $f(x_0 + 0)$ equals infinity we say that the
function, in a conditional sense, turns into infinity at the point $x_0$ (or
we say that the graph of the function “travels into infinity”). For
instance, the function $f(x) = \frac{1}{x}$ behaves in this way at the point
$x_0 = 0$.

In conclusion we remark that in some cases $f(x_0 - 0)$ or $f(x_0 + 0)$
has neither a finite nor an infinite value since a variable may have
neither a finite nor an infinite limit. For instance, we have
$\frac{1}{x} \to \infty$ as $x \to 0$ and therefore we see that the function
$f(x) = \sin\frac{1}{x}$ will pass from $-1$ to $+1$ and back to $-1$
infinitely many times as $x \to 0$ and thus $f(x) = \sin\frac{1}{x}$ has
neither a right-hand limit nor a left-hand one (and, in general, it has no
limit) at $x = 0$ (see Fig. 102).
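The three kinds of behaviour described above (a finite jump, turning into infinity, and the absence of any one-sided limit) can be compared by sampling a function at points approaching $x_0$ from one side. This is only a rough numerical sketch; the helper `one_sided_samples` and the test function `step` are hypothetical names chosen for the illustration.

```python
import math

def one_sided_samples(f, x0, side, steps=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Values of f at points approaching x0 from the left (side=-1) or right (side=+1)."""
    return [f(x0 + side * h) for h in steps]

def step(x):
    """Finite jump at 0: f(-0) = 0, f(+0) = 1."""
    return 0.0 if x < 0 else 1.0

def reciprocal(x):
    """1/x "turns into infinity" at 0."""
    return 1.0 / x

def wiggle(x):
    """sin(1/x) keeps oscillating between -1 and +1 near 0."""
    return math.sin(1.0 / x)

for name, f in (("step", step), ("1/x", reciprocal), ("sin(1/x)", wiggle)):
    left = one_sided_samples(f, 0.0, -1)
    right = one_sided_samples(f, 0.0, +1)
    print(name, left, right)
```

For `step` the samples settle on two different finite values, for `1/x` they grow without bound, and for `sin(1/x)` they stay between $-1$ and $+1$ without settling on any value.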
Variable quantities of physical nature have discontinuities when
a certain kind of action is suddenly applied or switched off,
when there is a transition from one medium into another (through
the interface between the two media), when the law of functional
relationship between the quantities suddenly changes etc.

Fig. 102

See, for example, Fig. 103. It shows a graph of the electric current $i$
plotted against time $t$ which corresponds to the process of transmitting
the letter “a” in the Morse code by radio (the signal “dot-dash”). As is
seen, we have here a function with four points of discontinuity and at
each of these points there is a finite jump corresponding to switching on
or switching off the constant emf (electromotive force) in the circuit.
It is important to pay attention to the fact that if we analysed this
phenomenon more carefully (and for this purpose took a much larger time
scale for the $t$-axis) then we should see that in reality the growth of
the current is similar to the one shown in Fig. 104. Since there is always
a certain inductance in the circuit the current increases continuously
(although very fast) and therefore in real circumstances there must be no
discontinuity of the function $i(t)$ at all! In some cases when, for
instance, a pulse lasts a very short time it may be important to take into
account the continuity of this transient process (for example,
corresponding to the part OA of the graph). But when the transient process
is of no importance it is simpler to schematize the process and consider
the function $i(t)$ discontinuous according to Fig. 103 if this does not
lead to noticeable mistakes. Thus, one and the same function $i(t)$ of
a real “physical” nature may be regarded as continuous or discontinuous
depending on whether we intend to take into account the transient
process or not. In the case of passage from one medium into another
an analogous role is played by the processes near the interface between
the media which may or may not be taken into account.

If an elementary function is considered then, as will be shown
in Sec. 14, it can have a discontinuity at $x = x_0$ if and only if the
substitution of $x = x_0$ into the function yields an expression
of types $\frac{1}{0}$, $\ln 0$ or $0^0$ in the expression of $f(x_0)$ or in
some part of this expression. G. H. Hardy (1877-1947), an English
mathematician, showed that in case such a situation takes place the limits
$f(x_0 - 0)$ and $f(x_0 + 0)$, finite or infinite, should exist provided
the function $f$ is defined, respectively, on the left or on the right
of $x_0$. Only in case the expressions $\sin\infty$ and $\cos\infty$ enter
into the representation of $f(x_0)$ there may be an exception to this rule.

If a function $f$ is defined only on one side of $x_0$, for example, on
the right, then it may happen that only $f(x_0 + 0)$ (“the end-point
value”) exists while $f(x_0 - 0)$ does not. The limits $f(-\infty)$ and
$f(+\infty)$ may also be regarded as end-point values.

14. Properties of Continuous Functions. 

1. The sum of two continuous functions is a continuous function.
Indeed, if the functions $f_1(x)$ and $f_2(x)$ are continuous and
$f(x) = f_1(x) + f_2(x)$ then, as $x \to x_0$,

$$\lim f(x) = \lim [f_1(x) + f_2(x)] = \lim f_1(x) + \lim f_2(x) = f_1(x_0) + f_2(x_0) = f(x_0)$$

which implies the continuity of the function $f(x)$ (see Sec. 12).
We point out that in the above proof we first used a property of
limits (see Sec. 5) and then the continuity of the functions $f_1$ and $f_2$.
Similar application of other properties of limits implies the following:

a sum or a difference or a product of an arbitrary number of
continuous functions is also a continuous function;

a ratio of two continuous functions is a function which is continuous
everywhere except the points where the denominator equals zero. The
ratio either approaches infinity or becomes an indeterminate form of
type $\frac{0}{0}$ at those points where the denominator vanishes.
Therefore the continuity does not hold in either case.

2. A composite function formed by means of continuous functions
is a continuous function. Indeed, if the functions $z(y)$ and $y(x)$ are
continuous and $x$ assumes an infinitesimal increment then, by the
continuity of the second function, the increment of $y$ will be
infinitesimal too and therefore the increment of $z$ will be also
infinitesimal by the continuity of the first function; thus, the composite
function $z(x)$ will be continuous.

The first two properties enable us to make the following conclusion
concerning the continuity of elementary functions. Reviewing
the basic elementary functions (see Sec. I.18 and § I.4) we see that
among them only $y = x^{-m}$ has a discontinuity for $-m < 0$ at $x = 0$
(in this case the form $\frac{1}{0}$ appears), $y = \log_a x$ has a
discontinuity at $x = 0$ (this yields $\log 0$) and $y = \tan x$ at
$x = \pm\frac{\pi}{2}, \pm\frac{3\pi}{2}, \ldots$ (this results in
$\frac{\sin\frac{\pi}{2}}{\cos\frac{\pi}{2}} = \frac{1}{0}$). When composite
functions and algebraic combinations are composed of basic functions then,
according to properties 1 and 2 and Sec. 15 (1), the new points of
discontinuity may appear if and only if the expressions of the form
$\frac{1}{0}$, $\ln 0$ and $0^0$ occur. This proves the assertion stated at
the end of Sec. 13.
Hence, if $f(x)$ is an elementary function then $\lim_{x \to x_0} f(x)$
simply equals $f(x_0)$ provided there are no “dangerous” expressions of the
form $\frac{1}{0}$, $\ln 0$ and $0^0$ in the expression of $f(x_0)$; for
example,

$$\lim_{x \to 1} \frac{\ln(1 + \sin x)}{x^2 - 2x} = \frac{\ln(1 + \sin 1)}{1^2 - 2 \cdot 1} = -\ln(1 + \sin 1) = -\ln 1.8415 = -0.6105$$

(the values of $\sin 1$ and of $\ln 1.8415$ are taken from tables). This
rule of determining a limit remains true for the operations on
infinities provided after substituting the limit of the argument the
indeterminate forms $\frac{0}{0}$, $\infty - \infty$, $\frac{\infty}{\infty}$,
$0 \cdot \infty$ (which were mentioned in Sec. 5), $0^0$, $1^\infty$,
$\infty^0$ [which will be discussed in Sec. 15 (1)] and expressions of the
form $\sin\infty$ do not appear. For example,

J™ q (~ + cosx)=^^ + oos( + 0) = =^+1 = -°o, 

$$\lim_{x \to +0} \left( x^{\frac{1}{x}} + x^{-\frac{1}{x}} \right) = (+0)^{+\infty} + (+0)^{-\infty} = 0 + \infty = \infty$$

For more on the indeterminate forms see § 3 and § IV.4.
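The substitution rule for elementary functions can be checked on the first worked limit of this item, $\frac{\ln(1 + \sin x)}{x^2 - 2x}$ as $x \to 1$. The following is a small sketch under that assumption; the name `g` is hypothetical and machine arithmetic replaces the tables.

```python
import math

def g(x):
    """The elementary function ln(1 + sin x) / (x**2 - 2x)."""
    return math.log(1.0 + math.sin(x)) / (x**2 - 2.0 * x)

# No "dangerous" expression appears at x = 1, so the limit is just g(1).
print("g(1) =", g(1.0))

# Values at nearby points approach g(1) from both sides.
for h in (1e-2, 1e-4, 1e-6):
    print(h, g(1.0 - h), g(1.0 + h))
```

The computed value agrees with the tabulated result quoted in the text to about three decimal places.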

3. As was indicated in Sec. I.16, the graph of a continuous function
$y = f(x)$ defined in an interval $a \le x \le b$ consists of one
component. The consideration of such a graph (see, for example, Fig. 105)
shows that a continuous function defined over a finite interval
(including its end-points) is bounded and attains its least (in the
algebraic sense) value (at $x = a$ in Fig. 105) and its greatest (at
$x = c$) value. These values are denoted, respectively, as $\min f(x)$ and
$\max f(x)$ (abbreviations of “minimum” and “maximum”). Besides, such
a function takes on all the intermediate values between $f(a)$ and $f(b)$,
each value being taken at least one time. For example, the value $y = q$
is assumed only one time in Fig. 105, i.e. at $x = \delta$, whereas the
value $y = p$ is taken three times at the points $x = \alpha$, $\beta$ and
$\gamma$. Further, if $f(x)$ is positive for some value $x = x_0$ it remains
positive at all points $x$ lying close enough to $x_0$. Finally, turning
back to Figs. 25 and 26 we see that if a continuous function is monotonic
then its inverse function is also continuous (and monotonic).

Fig. 105    Fig. 106

Here we shall restrict ourselves to the visual considerations
concerning these properties. Their rigorous proof does not appear simple
in the general case; this proof can be found, for instance, in [14].
It should be pointed out that a continuous function is not necessarily
smooth, that is having a graph with a certain tangent at its
every point (a smooth function is shown in Fig. 105). On the contrary,
as is shown in Fig. 106, it may happen to be piecewise smooth,
that is to have a broken graph consisting of several smooth arcs.
Even some more complicated cases are possible but we are not going
to treat them in our course. Some more detailed comments on this
question will be given in Sec. IV.3.

15. Some Applications. 

1. Limits of Composite Exponential Expressions (“Power-Exponential
Expressions”). Let us consider the expression $x^y$ where $x \to a$
and $y \to b$. Let us represent $x^y$ in the form $x^y = e^{y \ln x}$.
But $\ln x \to \ln a$ by the continuity of the logarithmic
function and hence $y \ln x \to b \ln a$ (as a limit of a product).
Therefore $\exp(y \ln x) \to \exp(b \ln a)$ (by the continuity of the
exponential function). Thus, $x^y \to e^{b \ln a} = a^b$; in other words,

$$\lim x^y = (\lim x)^{\lim y}$$

and so we see that it is permissible to pass to the limit in the
expression $x^y$. The exception to this rule is the case when the product
$b \ln a$ becomes an indeterminate form, i.e. it has the form
$0 \cdot \infty$. This may occur if

$\ln a = 0$, $b = \infty$, i.e. $a = 1$ and $a^b = 1^\infty$;

$\ln a = \infty$, $b = 0$, i.e. $a = \infty$ and $a^b = \infty^0$;

$\ln a = -\infty$, $b = 0$, i.e. $a = 0$ and $a^b = 0^0$.

Hence we just have the three types of the indeterminate forms
mentioned in Sec. 14.

One may sometimes think that there must be $1^\infty = 1$ since “unity
to any power equals unity”. But $1^\infty$ is not at all unity to a certain
finite power but only the abbreviated notation for a limit of an
expression of the form $x^y$ where $x \to 1$ and $y \to \infty$. Suppose,
for example, that $x \to 1 + 0$, that is $x > 1$. Then the expression $x^y$
“has a certain tendency to approach 1” (since $1^y = 1$) and at the same
time it “wants to tend to infinity” (since $x > 1$ and $x^\infty = \infty$
because if we raise a constant number larger than unity to an infinitely
increasing power we shall arrive at an infinitely large variable).
Therefore, these two tendencies “act” upon the expression in opposite
directions and hence the result may be different in different problems
depending on which of these tendencies “wins”. For example,
in case (13) the limit turned out to be equal to $e$ whereas the
immediate substitution of $h = 0$ yields $1^\infty$. In this case we see
that the tendencies are equal in a certain sense; they are in a state of
“balance”. Similar conclusions may be derived in connection with other
indeterminate forms.
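That the form $1^\infty$ can produce different results is easy to observe numerically. The three expressions below are illustrative choices made for this sketch (only the first, $(1 + h)^{1/h}$, is assumed to be the one behind the limit $e$ mentioned above); in each of them the base tends to 1 and the exponent tends to infinity as $h \to +0$.

```python
import math

# Each entry: (base as a function of h, exponent as a function of h).
cases = {
    "(1+h)^(1/h)":   (lambda h: 1.0 + h,     lambda h: 1.0 / h),
    "(1+h)^(2/h)":   (lambda h: 1.0 + h,     lambda h: 2.0 / h),
    "(1+h^2)^(1/h)": (lambda h: 1.0 + h * h, lambda h: 1.0 / h),
}

def power_exponential(name, h):
    """Value of the chosen power-exponential expression at a given h."""
    base, exponent = cases[name]
    return base(h) ** exponent(h)

for name in cases:
    print(name, [power_exponential(name, h) for h in (1e-1, 1e-3, 1e-6)])

print("e =", math.e, " e^2 =", math.e ** 2)
```

The first expression approaches $e$, the second approaches $e^2$, and in the third the tendency “to approach 1” wins.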

2. Solving Inequalities. Let a function $f(x)$ be considered on an
interval $(a, b)$ (in particular, the interval may be the whole $x$-axis)
and let it be necessary to solve the inequality

$$f(x) > 0 \tag{17}$$

that is to determine all the values of $x$ for which it holds. In the
geometrical sense this means that we must find such regions on the
$x$-axis where the graph of the function $y = f(x)$ lies over the $x$-axis
(such regions are shaded in Fig. 107). But it must be stressed that
in this problem we regard the function $f(x)$ as known whereas its
graph may be unknown.

To solve inequality (17) let us mark all the zeros of the function $f$
(that is the points where $f$ vanishes) on the interval $(a, b)$ and also
all the points of discontinuity (there are three zeros and one point
of discontinuity in Fig. 107). The interval is divided into several
parts by these points (five parts in Fig. 107). Since we have taken
into account all the points of discontinuity of the function it is
continuous inside each of these parts. Besides, the function does not
vanish inside the parts because we have reckoned all its zeros. Thus,
the function $y = f(x)$ retains its sign inside each of the parts (see
property 3 in Sec. 14). Now in order to determine this sign it is
sufficient to determine the sign of the function at any point inside
the considered part of the interval. After this operation we choose
those parts in which the function is positive and thus the inequality
(17) will be solved.

Fig. 108

Example. Let us solve the inequality $\frac{x^3 + 3x^2 - 4}{x^2 - 3} > 0$.

The numerator equals zero when $x = 1$ and therefore it is divisible
by $x - 1$. This implies that the expression

$$\frac{x^3 + 3x^2 - 4}{x^2 - 3} = \frac{(x - 1)(x^2 + 4x + 4)}{x^2 - 3} = \frac{(x - 1)(x + 2)^2}{x^2 - 3}$$

must be positive for the values of $x$ which we are interested in. Hence,
the function is defined over the whole $x$-axis except $x = \pm\sqrt{3}$ and
has two zeros ($x = 1$ and $x = -2$) and two points of discontinuity
($x = \pm\sqrt{3}$). These points break the $x$-axis into five parts (see
Fig. 108). Now we choose a point in each of the intervals, substitute
these values into the last fraction and determine the signs of the
fraction inside the intervals (the numerical values themselves do not
matter and only their signs are essential). Thus we receive the table

    Interval                      Sign of the fraction
    $-\infty < x < -2$                    -
    $-2 < x < -\sqrt{3}$                  -
    $-\sqrt{3} < x < 1$                   +
    $1 < x < \sqrt{3}$                    -
    $\sqrt{3} < x < \infty$               +

Thus, the solution of the inequality is a totality consisting of
two intervals:

$$-\sqrt{3} < x < 1 \quad \text{and} \quad \sqrt{3} < x < \infty$$
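The procedure of this example (mark the zeros and the points of discontinuity, then test one point in each part) can be sketched in a few lines. The sketch assumes the fraction $\frac{x^3 + 3x^2 - 4}{x^2 - 3}$; the sample points are arbitrary choices made here, one inside each of the five parts.

```python
import math

def f(x):
    """Left-hand side of the inequality f(x) > 0."""
    return (x**3 + 3 * x**2 - 4) / (x**2 - 3)

r3 = math.sqrt(3.0)
# Zeros (x = -2, x = 1) and points of discontinuity (x = -sqrt(3), x = sqrt(3)).
marks = [-math.inf, -2.0, -r3, 1.0, r3, math.inf]
# One arbitrary test point inside each of the five parts.
samples = [-3.0, -1.9, 0.0, 1.5, 2.0]

signs = [f(x) > 0 for x in samples]

for left, right, x, positive in zip(marks, marks[1:], samples, signs):
    print(f"({left:.3f}, {right:.3f})  test point {x:4.1f}  sign {'+' if positive else '-'}")

solution = [(l, r) for l, r, p in zip(marks, marks[1:], signs) if p]
print(solution)
```

Only the parts marked with a plus sign enter the answer; note that the double zero $x = -2$ does not change the sign of the fraction.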











CHAPTER IV


Derivatives, Differentials, 
Investigation of 
the Behaviour of Functions 


§ 1. Derivative 

1. Some Problems Leading to the Concept of a Derivative. We
come to the notion of a derivative, one of the most important notions
in mathematics, when investigating the rate of change of a function.

For example, let us turn to the notion of the velocity (rate) at
a given instant of a rectilinear motion of a material point. A material
point is understood in physics as a material body such that it
is permissible to neglect its geometrical sizes while investigating
the state of the body under some concrete conditions. In different
circumstances a particle of a substance, or an airplane, or a heavenly
body etc. may sometimes be regarded as a point. Let a material
point move along the $s$-axis from left to right. In the general case
the motion may be non-uniform, that is the velocity of the motion
may be variable. The law of motion is expressed mathematically
as a dependence of the coordinate $s$ on time $t$: $s = f(t)$. Since the
velocity is variable the ratio of the distance passed over to the
time taken represents the average velocity only. As for the “true”
velocity, that is the velocity at a given instant, it can be obtained
by means of the following procedure. Let the moving point occupy
the position A (see Fig. 109) at an instant $t$. Suppose during the period
of time $\Delta t$ (see Sec. I.22 on this notation) the moving point transits
to the position B, the distance $\Delta s$ being passed over. Then

$$s = f(t), \quad s + \Delta s = f(t + \Delta t)$$

i.e. $\Delta s = f(t + \Delta t) - f(t)$. Hence, the ratio
$v_{av} = \frac{\Delta s}{\Delta t}$ (which is the distance passed over per
unit of time taken) is the average velocity of motion during the time
period $\Delta t$ from $t$ to $t + \Delta t$. Now the instantaneous velocity
of motion at time $t$ is obtained as the limit of the average velocity in
the process of decreasing the interval $\Delta t$ unlimitedly, that is

$$v_{inst} = \lim_{\Delta t \to 0} v_{av} = \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \lim_{\Delta t \to 0} \frac{f(t + \Delta t) - f(t)}{\Delta t} \tag{1}$$

Fig. 109

It is also said that the instantaneous velocity (that is the velocity
at a given instant, the true velocity) is the average velocity during
an infinitesimal interval of time (“element” of time) or that the
instantaneous velocity is the ratio of an infinitesimal distance to an
infinitesimal time interval. Both definitions briefly express the
meaning of the general definition (1).

Fig. 110

The rate of a physical process is not in all cases represented by the
distance passed over related to the unit of the time taken. Let us
consider, for example, the process of filling a vessel. In this case
the dependence $V = f(t)$ of the volume already filled on time $t$
expresses the law of the process of filling. The average rate of filling
during the interval of time from $t$ to $t + \Delta t$ is represented by the
ratio

$$w_{av} = \frac{\Delta V}{\Delta t} = \frac{f(t + \Delta t) - f(t)}{\Delta t}$$

whereas the limit

$$w_{inst} = \lim_{\Delta t \to 0} w_{av} = \lim_{\Delta t \to 0} \frac{\Delta V}{\Delta t} = \lim_{\Delta t \to 0} \frac{f(t + \Delta t) - f(t)}{\Delta t} \tag{2}$$

serves as the instantaneous rate, i.e. the rate of filling at time $t$.
Thus we have arrived at an expression similar to (1).

But we can understand the velocity, the rate, even in a wider
sense relating the change of a quantity not to the unit of time but
to the unit of some other quantity. For example, let us consider
the notion of the linear density of a material line, that is of a body
such that, under given concrete conditions, it is permissible to take
into account only its size in one-dimensional extent (the longitudinal
size) neglecting the cross-section sizes. At the same time we do not
neglect its mass. If this line (“thread”) is homogeneous its linear
density is equal to the ratio of its mass to its length. In case the
thread is non-homogeneous its linear density is different at different
points. Let us reckon the distance from one of the ends of the thread
(see Fig. 110) and let the mass of the part of the thread corresponding
to the distance $s$ be equal to $M = f(s)$. If now some additional
distance $\Delta s$ is passed the ratio

$$\rho_{av} = \frac{\Delta M}{\Delta s} = \frac{f(s + \Delta s) - f(s)}{\Delta s}$$

represents the average linear density of the thread corresponding
to the part AB. The limit

$$\rho = \lim_{\Delta s \to 0} \rho_{av} = \lim_{\Delta s \to 0} \frac{\Delta M}{\Delta s} = \lim_{\Delta s \to 0} \frac{f(s + \Delta s) - f(s)}{\Delta s} \tag{3}$$

now gives the linear density of the thread at a point (namely, at
the point A). We may say that $\rho$ is the rate (velocity) of change
of the mass of the thread, i.e. the change of the mass per unit of
distance passed.

2. Definition of Derivative. From the mathematical point of
view expressions (1), (2) and (3) are quite similar. This enables us
to state the following definition. Let a function $y = f(x)$ be given.
Then the rate of its change related to the unit of change of the
argument $x$ is equal to

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$$

This rate (velocity) is called the derivative of the variable (function)
$y$ with respect to the variable (argument) $x$; in other words,
the derivative is the limit of the ratio of the increment of the function
to the increment of the argument taken in the process when the
increment of the argument approaches zero. Since this rate has, in general,
different values for different values of $x$ the derivative itself is a
new function of $x$. This new function is designated as $y' = f'(x)$.

Hence, in the examples of Sec. 1 the velocity of motion is equal
to the derivative of the distance passed with respect to the time,
i.e. $v = s'_t$ (the subscript $t$ in the expression $s'_t$ indicates that the
derivative is taken with respect to the variable $t$) etc.

For example, let us compute the derivative of the function $y = ax^2$.
Increasing the argument by an increment $\Delta x$ we receive the new
value of the argument $x + \Delta x$ and the new value of the function
$y + \Delta y = a(x + \Delta x)^2$ since $x + \Delta x$ should be substituted
for $x$ into the expression of the function. Thus,

$$\Delta y = a(x + \Delta x)^2 - ax^2 = 2ax\,\Delta x + a(\Delta x)^2$$

This implies

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{2ax\,\Delta x + a(\Delta x)^2}{\Delta x} = \lim_{\Delta x \to 0} (2ax + a\,\Delta x) = 2ax$$

Note that in the latter passage to the limit only $\Delta x$ varied as
$\Delta x \to 0$ whereas $x$ was considered to be constant. The result thus
obtained can be written in the form $(ax^2)' = 2ax$.
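The passage to the limit in this example can be followed numerically: for $y = ax^2$ the ratio $\frac{\Delta y}{\Delta x}$ equals $2ax + a\,\Delta x$, so it approaches $2ax$ as $\Delta x$ decreases. The values of `a` and `x` below are arbitrary choices for the illustration, and `difference_quotient` is a hypothetical helper name.

```python
def difference_quotient(f, x, dx):
    """Ratio of the increment of the function to the increment of the argument."""
    return (f(x + dx) - f(x)) / dx

a = 3.0          # arbitrary coefficient chosen for the illustration

def y(x):
    return a * x**2

x = 2.0          # the point is kept fixed; only dx varies
for dx in (1.0, 0.1, 0.01, 0.001):
    print(dx, difference_quotient(y, x, dx), 2 * a * x + a * dx)

print("limit value 2ax =", 2 * a * x)
```

The second and third columns coincide up to rounding, and both approach the value $2ax$.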

We leave it to the reader to verify that $(ax^3)' = 3ax^2$, $(ax)' = a$
and the like. We particularly note here that $x' = 1$.

3. Geometrical Meaning of Derivative. Let us consider the graph
of a function $f(x)$ (see Fig. 111). We see that
$\frac{\Delta y}{\Delta x} = \tan\beta$, i.e. the ratio is equal to the slope
of the secant mm. If $\Delta x \to 0$ then the secant turns round the point M
and tends to the position of the tangent ll. (This obvious property which
we have already used is in fact nothing but the definition of a tangent.)
Therefore

$$y'_0 = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \tan\beta = \tan\alpha \qquad [y'_0 = f'(x_0)] \tag{4}$$

that is the geometrical meaning of the derivative of a function is that
it is equal to the slope of the tangent. By formula (II.21) it is easy
now to put down the equation of the tangent ll:

$$y - y_0 = y'_0 (x - x_0) \tag{5}$$

where $x_0$ and $y_0$ are the coordinates of the point of tangency, $x$ and
$y$ are the moving coordinates of the point on the tangent straight line.

Similarly, the equation of the normal to the curve, that is of
the line perpendicular to the tangent at the point of tangency, has
the form $y - y_0 = -\frac{1}{y'_0}(x - x_0)$ (see problem 5 in Sec. II.9).
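The equations of the tangent and of the normal can be turned directly into a small helper. This is a sketch only: the function `tangent_and_normal` and the sample curve $y = x^2$ are hypothetical, and the normal is written under the assumption that $y'_0 \ne 0$.

```python
def tangent_and_normal(f, df, x0):
    """Return the tangent and the normal to y = f(x) at x0 as functions of x.

    df is the derivative of f; the normal requires df(x0) != 0.
    """
    y0 = f(x0)
    slope = df(x0)

    def tangent(x):
        return y0 + slope * (x - x0)        # y - y0 = y0' (x - x0)

    def normal(x):
        return y0 - (x - x0) / slope        # y - y0 = -(1 / y0') (x - x0)

    return tangent, normal

# Sample curve y = x**2 with derivative 2x, point of tangency (1, 1).
tangent, normal = tangent_and_normal(lambda x: x**2, lambda x: 2 * x, 1.0)
print(tangent(2.0), normal(2.0))
```

Both lines pass through the point of tangency, and the product of their slopes is $-1$, as it must be for perpendicular lines.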

In Sec. I.26 we said that the angle between two curves at the point
of their intersection was defined as the angle between the tangents
to the curves at that point, and therefore we are able now to determine
the angle by means of formula (II.23) since we know how to
determine the tangents. Note that the angle may turn out to be
zero in case these curves are tangent to each other, i.e. when their
tangents coincide.

When the graph of a function $y = f(x)$ is given the geometrical
meaning of the derivative makes it possible to indicate the slope
of the tangent to the graph and this enables us to draw immediately
a sketch of the graph of the derivative (see Fig. 112). For more accurate
“graphical computation of the derivative” it is necessary to
draw tangents to the given graph and measure their slopes. It turns
out that it is practically simpler to draw normals to the graph by
means of a shiny (metallic) ruler and to measure their slopes with
respect to the $y$-axis which is just the same. One of these procedures
is shown in Fig. 112. We apply the ruler perpendicularly to the
plane of the graph to one of its points, e.g. to the point M, and
turn the ruler in such a way that the reflection of the graph in the
ruler should prolong the graph without a break at M. In this position
the ruler will lie exactly along the normal to the graph at M.
Then we draw a straight line passing through the point (0; 1) and
parallel to the normal thus constructed. In this way we get the line
segment OP which is then transferred to the position qq. After a
number of such procedures are carried out we obtain a rather accurate
graph of the derivative.

While discussing the geometrical meaning of the derivative 
[formula (4)] we supposed that both variables x and y were dimen- 
sionless and that the scale was the same for both axes. But this is 
not always the case in practical problems. It follows from formu- 
la (1. 10) that in the general case we must write y' = ~ tan a. 

Thus in the general case the derivative is also equal to the slope of 
the tangent. 

Note that if the derivative $y'$ approaches infinity for some value
of $x$ (for $x = x_1$ in Fig. 113) then the tangent at the corresponding
point of the graph has the slope equal to infinity, that is the tangent
line is parallel to the $y$-axis. If the derivative has a jump
discontinuity at a point then the tangent turns jumpwise, i.e. the graph
is broken at the point (see the point $x = x_2$ in Fig. 113). In case
the function approaches infinity the derivative may also turn into
infinity (see the point $x = x_3$ in Fig. 113).

4. Basic Properties of Derivatives. 

1. The derivative of a constant equals zero. (The property is
obviously interpreted as the fact that the velocity of a body in a state
of rest is equal to zero.) The formal proof of the property looks as
follows: if $y = C = \text{const}$ then

$$\Delta y = C - C = 0, \quad \frac{\Delta y}{\Delta x} = 0, \quad y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} 0 = 0$$

2. The derivative of a sum is equal to the sum of the derivatives of
the summands. Indeed, if $y(x) = u(x) + v(x)$ then
$y(x + \Delta x) = u(x + \Delta x) + v(x + \Delta x)$ and

$$\begin{aligned}
\Delta y &= y(x + \Delta x) - y(x) = [u(x + \Delta x) + v(x + \Delta x)] - [u(x) + v(x)] \\
&= [u(x) + \Delta u + v(x) + \Delta v] - [u(x) + v(x)] = \Delta u + \Delta v
\end{aligned}$$

that is the increment of a sum is equal to the sum of the increments:
$\Delta(u + v) = \Delta u + \Delta v$. Hence, it follows that

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{\Delta u + \Delta v}{\Delta x} = \lim_{\Delta x \to 0} \left( \frac{\Delta u}{\Delta x} + \frac{\Delta v}{\Delta x} \right) = \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x} + \lim_{\Delta x \to 0} \frac{\Delta v}{\Delta x} = u' + v'$$

(in the deduction we have used the fact that the limit of a sum is
equal to the sum of the limits) which is the required proof. This
property can be rewritten in a different way as

$$(u + v)' = u' + v'$$

We have taken a sum of two summands. It is clear that the same
is true for an arbitrary number of summands. Similarly, the increment
of a difference is equal to the difference of the increments and
the derivative of a difference is equal to the difference of the
derivatives.

Example. $(x^3 - 3x^2 + x + 5)' = (x^3)' - (3x^2)' + (x)' + (5)' =
3x^2 - 6x + 1 + 0 = 3x^2 - 6x + 1$ (see the end of Sec. 2).

3. A constant factor can be taken outside the derivative sign, i.e.
$(Cu)' = Cu'$ (where $C = \text{const}$). Indeed, if $y = Cu$ then

$$\Delta y = y(x + \Delta x) - y(x) = Cu(x + \Delta x) - Cu(x) = C[u(x + \Delta x) - u(x)] = C\,\Delta u$$

in other words, if a function is multiplied by a constant its increment
is multiplied by the same constant: $\Delta(Cu) = C\,\Delta u$.

Hence,

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{C\,\Delta u}{\Delta x} = C \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x} = Cu'$$


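4. Formula for the Derivative of a Product. Let $y = uv$. Then
$\Delta y = (u + \Delta u)(v + \Delta v) - uv = (\Delta u)\,v + u\,\Delta v + \Delta u\,\Delta v$
which implies

$$\begin{aligned}
y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}
&= \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x}\,v + \lim_{\Delta x \to 0} u\,\frac{\Delta v}{\Delta x} + \lim_{\Delta x \to 0} \frac{\Delta u\,\Delta v}{\Delta x} \\
&= \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x} \cdot v + u \cdot \lim_{\Delta x \to 0} \frac{\Delta v}{\Delta x} + \lim_{\Delta x \to 0} \frac{\Delta u}{\Delta x} \cdot \lim_{\Delta x \to 0} \Delta v
\end{aligned}$$

Since $\Delta v \to 0$ as $\Delta x \to 0$, the last term vanishes. Thus,

$$(uv)' = u'v + uv' \tag{6}$$

For example,

$$\begin{aligned}
[(3x^2 + 5x)(4x^2 - 6)]' &= (3x^2 + 5x)'(4x^2 - 6) + (3x^2 + 5x)(4x^2 - 6)' \\
&= (6x + 5)(4x^2 - 6) + (3x^2 + 5x) \cdot 8x = 48x^3 + 60x^2 - 36x - 30
\end{aligned}$$

From formula (6) we can easily deduce the formula for the derivative
of a product of several factors. For example,

$$(uvw)' = [(uv)w]' = (uv)'w + (uv)w' = (u'v + uv')w + uvw' = u'vw + uv'w + uvw'$$

The formula for the derivative of a product of an arbitrary number
of factors looks quite similar. We note that property 3 can be easily
deduced from formula (6) by putting $v = C$.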

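5. Formula for the Derivative of a Quotient. Let $y = \frac{u}{v}$. Then

$$\Delta y = \frac{u + \Delta u}{v + \Delta v} - \frac{u}{v} = \frac{(\Delta u)\,v - u\,\Delta v}{v(v + \Delta v)}$$

From this, representing $\Delta v$ which enters into the denominator in
the form $\frac{\Delta v}{\Delta x}\,\Delta x$, we obtain

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{\frac{\Delta u}{\Delta x}\,v - u\,\frac{\Delta v}{\Delta x}}{v\left(v + \frac{\Delta v}{\Delta x}\,\Delta x\right)} = \frac{u'v - uv'}{v(v + v' \cdot 0)}$$

Thus,

$$\left(\frac{u}{v}\right)' = \frac{u'v - uv'}{v^2} \tag{7}$$

For example,

$$\left(\frac{5x^2}{3x^2 + 4}\right)' = \frac{(5x^2)'(3x^2 + 4) - (5x^2)(3x^2 + 4)'}{(3x^2 + 4)^2} = \frac{10x(3x^2 + 4) - 5x^2 \cdot 6x}{(3x^2 + 4)^2} = \frac{40x}{(3x^2 + 4)^2}$$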
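Formulas (6) and (7) can be checked against a direct numerical estimate of the derivative. The sketch below uses the pair $u = 3x^2 + 5x$, $v = 4x^2 - 6$ from the product example; `central_difference` is a hypothetical helper introduced here (a symmetric difference quotient), not a method used in the text.

```python
def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def u(x):  return 3 * x**2 + 5 * x
def du(x): return 6 * x + 5
def v(x):  return 4 * x**2 - 6
def dv(x): return 8 * x

def product_rule(x):
    """Formula (6): (uv)' = u'v + uv'."""
    return du(x) * v(x) + u(x) * dv(x)

def quotient_rule(x):
    """Formula (7): (u/v)' = (u'v - uv') / v**2, valid where v(x) != 0."""
    return (du(x) * v(x) - u(x) * dv(x)) / v(x) ** 2

x = 2.0
print(product_rule(x), 48 * x**3 + 60 * x**2 - 36 * x - 30)
print(product_rule(x), central_difference(lambda t: u(t) * v(t), x))
print(quotient_rule(x), central_difference(lambda t: u(t) / v(t), x))
```

At $x = 2$ the product rule and the expanded polynomial $48x^3 + 60x^2 - 36x - 30$ give the same number, and both rules agree with the numerical estimates up to rounding.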


6. The Derivative of a Composite Function. Let $y = f(u)$,
$u = \varphi(x)$ and let $y$ be regarded as a composite function of $x$. If $x$
receives an increment $\Delta x$ then the intermediate variable $u$ receives
an increment $\Delta u$ and therefore $y$ receives an increment $\Delta y$ too.
We have

$$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x} \;{}^{*} \tag{8}$$

Now let $\Delta x \to 0$. Then $\frac{\Delta u}{\Delta x} \to u'_x$ and hence
$\Delta u = \frac{\Delta u}{\Delta x}\,\Delta x \to u'_x \cdot 0 = 0$.
Therefore $\frac{\Delta y}{\Delta u} \to y'_u$. Passing to the limit in
formula (8) we obtain

$$y'_x = y'_u\,u'_x \tag{9}$$

The last formula may be rewritten as

$$[f(\varphi(x))]' = f'(\varphi(x))\,\varphi'(x) \tag{10}$$

In the case when a composite function is formed by means of
a greater number of intermediate stages the derivative is computed
in the same way. Thus, if $y = y(u)$, $u = u(v)$ and $v = v(x)$ then
$y'_x = y'_u \cdot u'_v \cdot v'_x$.

For instance, let $y = (x^2 - 5x + 3)^3$. Then we can denote $y = u^3$
where $u = x^2 - 5x + 3$, and by formula (9) we get

$$y'_x = y'_u \cdot u'_x = (u^3)'_u \cdot (x^2 - 5x + 3)'_x = 3u^2(2x - 5) = 3(x^2 - 5x + 3)^2(2x - 5)$$

which is, of course, simpler than removing the brackets! In practical
computations there is no need to write down all this in such a detailed
manner. For instance, the former calculations can be put down as
follows:

$$[(x^2 - 5x + 3)^3]' = 3(x^2 - 5x + 3)^2 (x^2 - 5x + 3)' = 3(x^2 - 5x + 3)^2 (2x - 5)$$

We use here formula (10), of course. After some practice one can
write the result immediately without intermediate transformations.
For this purpose it is advisable to remember the formulas for the
derivatives in the form $(u^2)' = 2uu'$, $(u^3)' = 3u^2u'$ and the like
(the derivatives are taken with respect to $x$).

* Formula (8) is valid, of course, only if $\Delta u \ne 0$. But in case
$\Delta u = 0$ it is also easy to prove formula (10). - Tr.
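The worked example $y = (x^2 - 5x + 3)^3$ gives a convenient check of formula (9). In this sketch the helper names are hypothetical and the symmetric difference quotient is used only as an independent numerical estimate.

```python
def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def inner(x):    return x**2 - 5 * x + 3      # u = x^2 - 5x + 3
def d_inner(x):  return 2 * x - 5             # u'_x
def outer(u):    return u**3                  # y = u^3
def d_outer(u):  return 3 * u**2              # y'_u

def chain_rule(x):
    """Formula (9): y'_x = y'_u * u'_x, with y'_u taken at u = inner(x)."""
    return d_outer(inner(x)) * d_inner(x)

x = 4.0
print(chain_rule(x))
print(central_difference(lambda t: outer(inner(t)), x))
```

At $x = 4$ we have $u = -1$, so formula (9) gives $3 \cdot (-1)^2 \cdot 3 = 9$, and the numerical estimate agrees with it up to rounding.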

7. The Derivative of an Inverse Function. Suppose the equality
$y = y(x)$ defines the inverse relation $x = x(y)$ (see Sec. I.21) for
which we can determine the derivative $x'_y$. Then it is easy to compute
the derivative of the original function $y(x)$. Indeed,
$\frac{\Delta y}{\Delta x} = \frac{1}{\;\frac{\Delta x}{\Delta y}\;}$ which
implies, as $\Delta x \to 0$ and $\Delta y \to 0$,

$$y'_x = \frac{1}{x'_y} \tag{11}$$

For example, let $y = \sqrt[3]{x}$ which yields $x = y^3$. Then

$$y'_x = \frac{1}{x'_y} = \frac{1}{(y^3)'_y} = \frac{1}{3y^2} = \frac{1}{3\sqrt[3]{x^2}}$$

8. The Derivative of an Implicit Function. If a function is
determined in an implicit form $F(x, y) = 0$ (see Sec. I.20) then to
compute the derivative $y'_x$ one should simply equate the derivatives of
the left-hand side and of the right-hand side of the latter relation
taking into account that $y$ is a function of $x$ which turns the relation
into an identity. Generally, it is permissible to equate the derivatives
of both sides of an equality if and only if the equality is an
identity (but not an equation!).

For instance, let us take

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \tag{12}$$

Then

$$\left(\frac{x^2}{a^2} + \frac{y^2}{b^2}\right)'_x = (1)'_x, \quad \text{i.e.} \quad \frac{2x}{a^2} + \frac{2yy'}{b^2} = 0 \tag{13}$$

Computing the derivative of the second summand we have used
property 6: $\left(\frac{y^2}{b^2}\right)'_x = \frac{1}{b^2}\,(y^2)'_y \cdot y'_x = \frac{1}{b^2} \cdot 2yy'$.
Thus, (13) implies

$$y' = -\frac{b^2 x}{a^2 y} \tag{14}$$
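The result of implicit differentiation can be compared with the derivative of an explicit branch. The sketch below assumes that relation (12) is the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ and that (14) reads $y' = -\frac{b^2 x}{a^2 y}$; the semi-axes `a`, `b` and the point `x0` are arbitrary values chosen for the check.

```python
import math

a, b = 2.0, 1.0      # hypothetical semi-axes, chosen only for this check

def y_upper(x):
    """Upper branch of the assumed relation x^2/a^2 + y^2/b^2 = 1, solved for y."""
    return b * math.sqrt(1.0 - x * x / (a * a))

def implicit_derivative(x, y):
    """Assumed form of (14): y' = -(b^2 x) / (a^2 y), valid where y != 0."""
    return -(b * b * x) / (a * a * y)

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

x0 = 1.0
y0 = y_upper(x0)
print(implicit_derivative(x0, y0), central_difference(y_upper, x0))
```

The implicit formula needs both coordinates of the point, whereas the explicit branch had to be solved for $y$ first; the two values coincide up to rounding.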


5. Derivatives of Basic Elementary Functions. 

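1. The Derivative of Sine. Let $y = \sin x$. If the argument changes
and becomes equal to $x + \Delta x$ then the function becomes equal to
$\sin(x + \Delta x)$. This implies

$$\Delta y = \sin(x + \Delta x) - \sin x = 2 \sin\frac{\Delta x}{2} \cos\left(x + \frac{\Delta x}{2}\right);$$

$$y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{2 \sin\frac{\Delta x}{2} \cos\left(x + \frac{\Delta x}{2}\right)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\sin\frac{\Delta x}{2}}{\frac{\Delta x}{2}} \cdot \lim_{\Delta x \to 0} \cos\left(x + \frac{\Delta x}{2}\right) = 1 \cdot \cos x$$

[see formula (III.11)]. Hence,

$$(\sin x)' = \cos x \tag{15}$$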

2. We leave it to the reader to verify in an analogous way that 

(cos x)' = —sin x (16) 

3. The derivative of tangent is calculated by formula (7):

    $(\tan x)' = \left(\frac{\sin x}{\cos x}\right)' = \frac{(\sin x)' \cos x - \sin x (\cos x)'}{\cos^2 x} = \frac{\cos x \cos x - \sin x (-\sin x)}{\cos^2 x} = \frac{1}{\cos^2 x}$

4. Similarly, we can verify that $(\cot x)' = -\frac{1}{\sin^2 x}$.

5. The Derivative of Arc Sine. Take $y = \arcsin x$. Then $x = \sin y$
and, by formula (11),

    $y'_x = \frac{1}{x'_y} = \frac{1}{(\sin y)'_y} = \frac{1}{\cos y} = \frac{1}{+\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}$

We have written $+$ in front of the radical sign because the values
of $\arcsin x$, as is well known, are taken in the interval
$-\frac{\pi}{2} \le \arcsin x \le \frac{\pi}{2}$, which corresponds to non-negative values
$\cos y \ge 0$. Thus,

    $(\arcsin x)' = \frac{1}{\sqrt{1 - x^2}}$    (17)

6. We verify similarly that

    $(\arccos x)' = -\frac{1}{\sqrt{1 - x^2}}$

using the inequality $0 \le \arccos x \le \pi$. The resemblance between
the last two results is explained by the formula

    $\arcsin x + \arccos x = \frac{\pi}{2}$    (18)

which can be deduced in the following way: if we denote $\sin\alpha = x$
then $\cos\left(\frac{\pi}{2} - \alpha\right) = x$,
and these formulas yield $\alpha = \arcsin x$ and $\frac{\pi}{2} - \alpha = \arccos x$.
The addition of the last results implies (18).

7. The Derivative of Arc Tangent. Let $y = \arctan x$. Then $x = \tan y$.
Using the formulas for the derivatives of an inverse function
and of the tangent we obtain

    $y'_x = \frac{1}{x'_y} = \frac{1}{\dfrac{1}{\cos^2 y}} = \frac{1}{1 + \tan^2 y} = \frac{1}{1 + x^2}$

Thus,

    $(\arctan x)' = \frac{1}{1 + x^2}$

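8. The Derivative of a Logarithmic Function. Take $y = \ln x$.
Then putting $k = \frac{\Delta x}{x}$ in formula (III.12) we see that

    $y' = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{\ln(x + \Delta x) - \ln x}{\Delta x} = \lim_{\Delta x \to 0} \frac{1}{x} \cdot \frac{\ln\left(1 + \frac{\Delta x}{x}\right)}{\frac{\Delta x}{x}} = \frac{1}{x}$

Therefore

    $(\ln x)' = \frac{1}{x}$

Applying formula (I.14) and taking into account that $\ln a = \text{const}$
we obtain

    $(\log_a x)' = \left(\frac{\ln x}{\ln a}\right)' = \frac{1}{\ln a} (\ln x)' = \frac{1}{x \ln a}$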

9. The Derivative of an Exponential Function. If $y = a^x$ then
$x = \log_a y$ and

    $y'_x = \frac{1}{x'_y} = \frac{1}{\dfrac{1}{y \ln a}} = y \ln a = a^x \ln a$

Hence,

    $(a^x)' = a^x \ln a$

In particular $(e^x)' = e^x$.

10. The Derivative of a Power Function. According to the formula
of the derivative of a composite function we have

    $(x^n)' = [(e^{\ln x})^n]' = (e^{n \ln x})' = e^{n \ln x} \cdot n \cdot \frac{1}{x} = x^n \cdot n \cdot \frac{1}{x} = n x^{n-1}$

Thus,

    $(x^n)' = n x^{n-1}$

This formula holds for any $n$, both integral and non-integral.
For example, $(\sqrt{x})' = (x^{1/2})' = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}$,
$\left(\frac{1}{x}\right)' = (x^{-1})' = -1 \cdot x^{-1-1} = -\frac{1}{x^2}$ and the like.

11. The Derivatives of Hyperbolic Functions. We have

    $(\sinh x)' = \left(\frac{e^x - e^{-x}}{2}\right)' = \frac{e^x - e^{-x}(-1)}{2} = \frac{e^x + e^{-x}}{2} = \cosh x$

Similarly,

    $(\cosh x)' = \sinh x$;

    $(\tanh x)' = \left(\frac{\sinh x}{\cosh x}\right)' = \frac{(\sinh x)' \cosh x - \sinh x (\cosh x)'}{\cosh^2 x} = \frac{\cosh^2 x - \sinh^2 x}{\cosh^2 x} = \frac{1}{\cosh^2 x}$;

    $(\sinh^{-1} x)' = [\ln(x + \sqrt{x^2 + 1})]' = \frac{1}{x + \sqrt{x^2 + 1}} (x + \sqrt{x^2 + 1})' = \frac{1}{x + \sqrt{x^2 + 1}} \left(1 + \frac{2x}{2\sqrt{x^2 + 1}}\right) = \frac{1}{x + \sqrt{x^2 + 1}} \cdot \frac{\sqrt{x^2 + 1} + x}{\sqrt{x^2 + 1}} = \frac{1}{\sqrt{x^2 + 1}}$

These formulas also demonstrate a rather close analogy between 
trigonometric and hyperbolic functions. 
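
The formulas of this section can be checked all at once against difference quotients. The following Python sketch is given only as an illustration (the dictionary `TABLE`, the function `max_table_error` and the test point $x = 0.3$ are our own choices); each entry pairs a function from the standard `math` module with the derivative stated above.

```python
import math

# Each entry: name -> (function, derivative claimed by the table of this section).
TABLE = {
    "sin": (math.sin, math.cos),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: 1.0 / math.cos(x) ** 2),
    "arcsin": (math.asin, lambda x: 1.0 / math.sqrt(1.0 - x * x)),
    "arccos": (math.acos, lambda x: -1.0 / math.sqrt(1.0 - x * x)),
    "arctan": (math.atan, lambda x: 1.0 / (1.0 + x * x)),
    "ln": (math.log, lambda x: 1.0 / x),
    "exp": (math.exp, math.exp),
    "sinh": (math.sinh, math.cosh),
    "cosh": (math.cosh, math.sinh),
    "tanh": (math.tanh, lambda x: 1.0 / math.cosh(x) ** 2),
    "arsinh": (math.asinh, lambda x: 1.0 / math.sqrt(x * x + 1.0)),
}


def max_table_error(x=0.3, h=1e-5):
    """Largest gap between a symmetric difference quotient and the table value."""
    worst = 0.0
    for f, df in TABLE.values():
        numeric = (f(x + h) - f(x - h)) / (2.0 * h)
        worst = max(worst, abs(numeric - df(x)))
    return worst


if __name__ == "__main__":
    print(max_table_error())
```

A small printed value indicates agreement for every entry; a wrong sign in any formula would show up as an error of order one.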

12. The above formulas (comprising the table of basic differentiation*
formulas) should be learnt by heart since they will be used
constantly in what follows. With the help of the formulas it
is possible to compute the derivative of any elementary function by
using the rules of Sec. 4. For example,

    $(\sqrt[3]{x}\, 2^{\tan 5x})' = (\sqrt[3]{x})'\, 2^{\tan 5x} + \sqrt[3]{x}\, (2^{\tan 5x})'$

But

    $(\sqrt[3]{x})' = (x^{1/3})' = \frac{1}{3} x^{-2/3} = \frac{1}{3\sqrt[3]{x^2}}$

and, by the formula of the derivative of a composite function,
we obtain

    $(2^{\tan 5x})' = 2^{\tan 5x} \ln 2 \cdot (\tan 5x)' = 2^{\tan 5x} \ln 2 \cdot \frac{1}{\cos^2 5x} (5x)' = 2^{\tan 5x} \ln 2 \cdot \frac{5}{\cos^2 5x}$

Taking the common factor outside the brackets we derive

    $(\sqrt[3]{x}\, 2^{\tan 5x})' = \frac{2^{\tan 5x}}{3\sqrt[3]{x^2}} \left(1 + \frac{15 x \ln 2}{\cos^2 5x}\right)$

After some practice calculations of this type can be carried out much
faster without intermediate transformations.

* The operation of finding the derivative of a function is usually called
differentiation (see Sec. 8).

13. In some cases it is useful to take logarithms before calculating
a derivative. For example, let it be necessary to find the derivative
$(x^{\sin x})'$. Then we write $y = x^{\sin x}$; $\ln y = \sin x \ln x$ and
$(\ln y)'_x = (\sin x \ln x)'$. Therefore,

    $\frac{1}{y} y' = \cos x \ln x + \sin x \cdot \frac{1}{x}$

To calculate the left-hand side we have applied the formula of the
derivative of a composite function. Finally, from the last relation,

    $y' = (x^{\sin x})' = x^{\sin x} \left(\cos x \ln x + \sin x \cdot \frac{1}{x}\right)$

This method is sometimes used when it is necessary to find the
derivative of the product of several factors since after taking the
logarithm the product turns into a sum, and, generally speaking,
it is easier to find the derivative of the sum than that of the product.
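
The result of logarithmic differentiation can be verified numerically. The sketch below is merely illustrative (the names `f`, `f_prime` and the step `h` are our own); it evaluates the closed form obtained above for $y = x^{\sin x}$ and a difference quotient at the same point.

```python
import math


def f(x):
    """y = x**sin(x), defined for x > 0."""
    return x ** math.sin(x)


def f_prime(x):
    """Derivative obtained by taking logarithms: y' = y*(cos x ln x + sin x / x)."""
    return f(x) * (math.cos(x) * math.log(x) + math.sin(x) / x)


def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


if __name__ == "__main__":
    x = 2.0
    print(f_prime(x), central_difference(f, x))
```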

6. Determining Tangent in Polar Coordinates. The problem of
determining the tangent to a curve which is represented by its equation
in Cartesian coordinates was solved in Sec. 3. Now let a curve
be given by its equation $\rho = f(\varphi)$ in polar coordinates. To determine
the tangent we could transform the equation into Cartesian coordinates
but it is simpler to solve the problem directly in polar coordinates.
Let the position of the tangent be determined by the angle $\theta$
(see Fig. 114). Let us give $\varphi$ a small increment $\Delta\varphi$ and let us
consider the infinitesimal curvilinear triangle $MNP$ formed by the
two coordinate lines and the graph of $\rho = f(\varphi)$. For the sake of
convenience the triangle is depicted in Fig. 114 separately. The
triangle may be regarded as a "genuine" triangle (i.e. as a rectilinear
triangle) to within infinitesimals of higher order and even
as a right-angled triangle since $\angle N = 90°$. (Why is it so?) This
implies

    $\cot\theta = \lim_{\Delta\varphi \to 0} \cot\theta^* = \lim_{\Delta\varphi \to 0} \frac{\Delta\rho}{\rho\,\Delta\varphi} = \frac{1}{\rho}\,\rho'_\varphi$    (19)

Example 1. For the logarithmic spiral (see Sec. II.5) we have

    $\cot\theta = \frac{1}{e^{k\varphi}} (k e^{k\varphi}) = k = \text{const}$

Thus, the spiral intersects all the coordinate rays forming one and
the same angle with them. It is easy to find the relationship between
this property and the property indicated in Sec. II.5: they turn
out to be equivalent.

Fig. 114

Example 2. Let us consider the polar equation of a parabola with
respect to its focus [i.e. equation (II.29) in which we should put
$\varepsilon = 1$]. By formula (19) we have

    $\cot\theta = \frac{1 + \cos\varphi}{p} \cdot \frac{p \sin\varphi}{(1 + \cos\varphi)^2} = \frac{\sin\varphi}{1 + \cos\varphi} = \frac{2 \sin\frac{\varphi}{2} \cos\frac{\varphi}{2}}{2 \cos^2\frac{\varphi}{2}} = \tan\frac{\varphi}{2} = \cot\left(\frac{\pi}{2} - \frac{\varphi}{2}\right)$;  $\theta = \frac{\pi}{2} - \frac{\varphi}{2}$

Therefore if we draw a straight line parallel to the polar axis and
passing through the point $M$ we shall have (see Fig. 115)
$\alpha + \theta = \pi - \varphi$, that is

    $\alpha = \pi - \varphi - \theta = \pi - \varphi - \left(\frac{\pi}{2} - \frac{\varphi}{2}\right) = \frac{\pi}{2} - \frac{\varphi}{2} = \theta = \beta$
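
The relation $\cot\theta = \rho'_\varphi / \rho$ can be tested by computing the angle between the coordinate ray and the tangent directly from Cartesian coordinates. The following sketch is an illustration under our own assumptions: the spiral is taken as $\rho = e^{k\varphi}$, the parabola as $\rho = p/(1 + \cos\varphi)$, and the helper `cot_theta` is a hypothetical name.

```python
import math


def cot_theta(rho, phi, h=1e-6):
    """cot of the angle between the coordinate ray and the tangent at angle phi.

    The tangent vector is estimated by a symmetric difference of the
    Cartesian point (rho*cos(phi), rho*sin(phi)).
    """
    def point(p):
        r = rho(p)
        return r * math.cos(p), r * math.sin(p)

    (x1, y1), (x2, y2) = point(phi - h), point(phi + h)
    tx, ty = (x2 - x1) / (2.0 * h), (y2 - y1) / (2.0 * h)   # tangent vector
    ux, uy = math.cos(phi), math.sin(phi)                   # unit vector along the ray
    dot = tx * ux + ty * uy        # component of the tangent along the ray
    cross = ux * ty - uy * tx      # component of the tangent across the ray
    return dot / cross


if __name__ == "__main__":
    k = 0.25
    for phi in (0.5, 2.0, 4.0):
        print(phi, cot_theta(lambda p: math.exp(k * p), phi))   # close to k each time
    p = 2.0
    print(cot_theta(lambda a: p / (1.0 + math.cos(a)), 1.0), math.tan(0.5))
```

For the spiral the printed value stays near $k$ for every $\varphi$, and for the parabola it follows $\tan(\varphi/2)$, in agreement with Examples 1 and 2.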

From this we obtain the basic optical property of a parabola : if the 
light propagates in the plane of the parabola from a light source 
placed in its focus then all the rays reflected by the parabola are 



Fig. lie 


parallel to its axis. Here lies the explanation of the fact that the
reflector of a projector is shaped in the form of a surface generated
by the revolution of a parabola about its axis (the parabolic reflector).

It is a little more complicated to deduce analogous optical properties
of an ellipse and of a hyperbola. For instance, one can show
that all the rays issued from a focus of an ellipse gather at the other
focus after being reflected from the ellipse (see Fig. 116). A parabola
may be regarded as an ellipse with one of its foci removed to
infinity (see the end of Sec. II.12) and therefore the optical property
of a parabola is implied by that of an ellipse if we pass to the limit.


§ 2. Differential 

7. Physical Examples. The notion of a differential is closely 
related to that of a derivative and is also one of the most important 
notions in mathematics. We shall illustrate it by considering the 
same examples as in Sec. 1. 

Let a point have the velocity $v = s'_t = f'(t)$ at a moment $t$ in
its rectilinear motion according to the law of motion $s = f(t)$.
If now some additional time $\Delta t$ passes the point will cover some
additional distance $\Delta s$. In case the motion is non-uniform the dependence
of $\Delta s$ on $\Delta t$ can be complicated because the velocity of motion
varies all the time. But if $\Delta t$ (the time passed) is not large the velocity
has no time to change considerably during the period of time from
$t$ to $t + \Delta t$. Therefore the motion may be regarded as "almost uniform"
during this period. Hence, reckoning the distance we shall
not get a serious error if we regard the motion as uniform, i.e. as
having a constant velocity, namely, the velocity it had at the
moment $t$.

Thus we obtain the distance $v\,\Delta t = s'_t\,\Delta t$. The distance is directly
proportional to the time passed $\Delta t$. The product $s'_t\,\Delta t$ is called the
differential of the distance and is denoted as $ds$: $ds = s'_t\,\Delta t$ (the symbol $ds$
should be understood as an indivisible symbol and not as the product of $d$
by $s$). The real distance $\Delta s$ differs from the "invented" distance $ds$,
of course, since the velocity may change during the period $\Delta t$ no
matter how small $\Delta t$ is. But nevertheless if this period is small
enough we can put approximately

    $\Delta s \approx ds$    (20)

But the smaller $\Delta t$, the smaller the change of the velocity. Therefore
the accuracy of formula (20) becomes greater as $\Delta t$ is decreased.
In Sec. 8 we shall show that when the interval $\Delta t$ is infinitesimal
the difference between $\Delta s$ and $ds$ is an infinitesimal variable of higher
order relative to $\Delta s$. There are many situations when it is permissible
to neglect such infinitesimals of higher order. Then it is possible
to say that the differential of the distance is nothing but an
infinitesimal distance, i.e. the distance corresponding to an
infinitesimal interval of time. At the same time, of course, the
differential of the distance may not be an infinitesimal at all in case
$\Delta t$ is not small, but the greater $\Delta t$, the lower the degree of accuracy
of formula (20). Nevertheless, it is much easier to compute $ds$ as
a distance passed in a uniform motion than to evaluate the real
distance $\Delta s$; this accounts for the fact that formula (20) is often
used even when $\Delta t$ is not very small.

Turning to the second example and reasoning in the same way
we can say that the differential of the volume $dV$ represents the
volume which would be filled if the rate of filling remained constant
and equal to the rate at the moment $t$ during the period of time
from $t$ to $t + \Delta t$, that is $dV = V'_t\,\Delta t$. Similarly, the differential of
the mass in the third example is the mass which the part $AB$ of the
curve (see Fig. 110) would have if the density of this part were constant
and equal to the density at the point $A$, that is
$dM = \rho\,\Delta s = M'_s\,\Delta s$.

In all cases the replacement of a real change of a quantity by its 
differential reduces to the transition from some non-uniform pro- 
cesses, non-homogeneous objects, etc. to the uniform and homo- 
geneous ones. Such a replacement is always based upon the fact 
that every process is “almost uniform” during a small interval of 
time and every object is “almost homogeneous” in the small and 
so on. 

8. Definition of Differential and Its Connection with Increment.
Now we shall give the general definition of a differential. Let the
argument of a function $y = f(x)$ first take a value $x$ and then receive
an increment $\Delta x$. Then the differential of the function is the product

    $dy = df(x) = y'\,\Delta x = f'(x)\,\Delta x$    (21)

The differential is therefore the increment which the function would
receive if it changed in the interval from $x$ to $x + \Delta x$ with the same
velocity as for the value $x$ of the argument.

The operation of finding the differential of a function is called
differentiation of the function; it is carried out quite simply by means
of formula (21). For example, let $y = \sin x$; then $dy = (\sin x)'\,\Delta x = \cos x\,\Delta x$,
i.e. $d \sin x = \cos x\,\Delta x$.

Similarly, $d \tan x = \frac{1}{\cos^2 x}\,\Delta x$, $d(x^3) = 3x^2\,\Delta x$ and the like.

Thus, when differentiating a function one must find its derivative
and multiply the result by $\Delta x$; therefore the operation of computing
the derivative is also often called differentiation. But one must
take care not to confuse the derivative with the differential. The
derivative of a function $y = f(x)$ depends only on $x$ whereas the
differential also depends on $\Delta x$. In practical applications the
differential is usually regarded as an infinitesimal whereas the derivative
is understood as a finite value. In case the variables $x$ and $y$ have certain
dimensions, the differential $dy$ has the dimension of $y$ whereas the
derivative $y'$ has the dimension of $y$ divided by the dimension of $x$.

We note, in particular, that

    $dx = x'_x\,\Delta x = 1 \cdot \Delta x = \Delta x$

which means that the differential of an independent variable is equal
to its increment. This makes it possible to rewrite formula (21):

    $dy = f'(x)\,dx = y'\,dx$    (22)

and, on the other hand, to represent the derivative as the ratio of
the differentials:

    $y'_x = \frac{dy}{dx}$ or, which is the same, $f'(x) = \frac{df(x)}{dx}$

The geometrical meaning of the differential of a function is shown
in Fig. 117: the differential is equal to the increment of the ordinate
of the tangent. Hence, the replacement of the increment of a function
by its differential is equivalent to the replacement of the graph of
the function by the segment of the tangent drawn through the point
$A$. This replacement is justified in case $\Delta x$ is small enough.

To investigate the connection between the differential and the
increment we take into account that $\frac{\Delta y}{\Delta x} \to y'$, i.e. $\frac{\Delta y}{\Delta x} = y' + \alpha$
where $\alpha \to 0$ as $\Delta x \to 0$. This yields

    $\Delta y = y'\,\Delta x + \alpha\,\Delta x = dy + \beta$    (23)

where $\beta = \alpha\,\Delta x$ is an infinitesimal variable of higher order than
$\Delta x$ (it is represented by the segment $CD$ in Fig. 117). The equality
(23) can be formulated as follows: the differential is the principal linear part
of the increment of a function. It is called here the principal part
since the difference between the differential and the increment is
the infinitesimal $\beta$ of higher order and it is called the linear part
since it is directly proportional to $\Delta x$ (compare with Sec. I.22).
If $y' \ne 0$ then $dy$ and $dx = \Delta x$ are infinitesimals of the same order
and therefore $\beta$ in formula (23) is an infinitesimal of higher order
than $dy$, that is $dy$ and $\Delta y$ are equivalent infinitesimals (see Sec.
Let us take an example to illustrate the error which occurs if the
increment of a function is replaced by its differential. Let $y = x^2$
and let the argument first assume the value $x = 1$ and then receive
the increment $\Delta x$. We have

    $\Delta y = (1 + \Delta x)^2 - 1^2 = 2\,\Delta x + \Delta x^2$;
    $dy = y'\,\Delta x = 2 \cdot 1 \cdot \Delta x = 2\,\Delta x$

Therefore, $\Delta y$ and $dy$ differ from each other by the infinitesimal
$(\Delta x)^2$ of the second order (see Fig. 118). In particular,

    for $\Delta x = 0.1$:     $\Delta y = 0.21$;        $dy = 0.2$;      the error is 5 per cent;
    for $\Delta x = 0.01$:    $\Delta y = 0.0201$;      $dy = 0.02$;     the error is 0.5 per cent;
    for $\Delta x = -0.001$:  $\Delta y = -0.001999$;   $dy = -0.002$;   the error is 0.05 per cent etc.

It is obvious here that the relative error generated by the replacement
of $\Delta y$ by $dy$ decreases rapidly as $|\Delta x|$ decreases.
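
The figures of this example are easy to reproduce. The short Python sketch below is only an illustration (the function name `increment_and_differential` is our own); for $y = x^2$ it returns the increment, the differential and the relative error of replacing the one by the other.

```python
def increment_and_differential(x, dx):
    """Return (increment, differential, relative error) for y = x**2."""
    delta_y = (x + dx) ** 2 - x ** 2      # exact increment of the function
    dy = 2.0 * x * dx                     # differential: y' * dx with y' = 2x
    rel_err = abs(delta_y - dy) / abs(delta_y)
    return delta_y, dy, rel_err


if __name__ == "__main__":
    for dx in (0.1, 0.01, -0.001):
        delta_y, dy, rel_err = increment_and_differential(1.0, dx)
        print(dx, delta_y, dy, 100.0 * rel_err)   # last column in per cent
```

The last column falls roughly tenfold when $|\Delta x|$ falls tenfold, as is to be expected for an error of the second order.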

A function which has a differential is called differentiable.
In other words, a differentiable function is a function such that its
small increment has the principal linear part, i.e. a function which
may be approximately replaced by a linear function on every small
interval of change of the argument (such a replacement is the so-called
process of linearizing). A differentiable function must have
a finite derivative, and the function itself must be continuous for
the considered values of the argument since (23) shows that $\Delta y$
is infinitesimal when $\Delta x$ is infinitesimal. At the same time a continuous
function may turn out to be non-differentiable at some
points. For instance, the function shown in Fig. 113 is non-differentiable
not only at the point of discontinuity $x = x_3$ but also at the
points $x = x_1$ and $x = x_2$ where it is continuous. B. Bolzano, a
famous Czech mathematician (1781-1848), and, independently of
him, K. Weierstrass (1815-1897), a prominent German mathematician,
discovered (the former in 1830 and the latter in 1860; Bolzano's
result was not published) the existence of continuous functions which
are non-differentiable for all values of the argument. Such functions
had been considered a mathematical trick for a long time but it
turned out that they were of essential importance for describing
processes of the type of a Brownian motion. We shall not pay attention
to the possibility of existence of such "monsters" in our introductory
course.

9. Properties of Differential. The differential of a function is
obtained by multiplying its derivative by the differential of the
argument and therefore each property of the derivative (see Sec. 4)
obviously implies the corresponding property of the differential.
For example, multiplying both parts of the equality $(u + v)' = u' + v'$
by $dx$ we obtain $(u + v)'\,dx = u'\,dx + v'\,dx$ or, which
is the same,

    $d(u + v) = du + dv$

(i.e. the differential of a sum is equal to the sum of the differentials).
Similarly, we deduce the formula

    $d(uv) = (du)\,v + u\,dv$    (24)

and the like. We shall see in Sec. IX.12 that these formulas also
hold for the case of an arbitrary number of independent variables.

The implication of the formula for the derivative of a composite
function is of special importance. Let $y = f(x)$ and let $x$ first be
an independent variable. Then each of the formulas (21) and (22)
can be used for calculating $dy$ since in this case $\Delta x = dx$. Now let $x$
depend on a third variable, for example, $x = x(t)$. Then $\Delta x \ne dx$
but it turns out that nevertheless formula (22) remains true [whereas
formula (21) does not hold, in general]. Indeed,

    $dy = y'_t\,dt = y'_x x'_t\,dt = y'_x\,dx$

which is what we set out to prove. Therefore it is natural to use
formula (22) [and not (21)] for calculating the differential since
this formula remains true (invariant) in all cases.

Now we shall apply this invariance property* to computing the
derivative of a function represented parametrically (see Sec. II.6).
Let $x = x(t)$ and $y = y(t)$ ($t$ is a parameter). Then $dx = \dot{x}\,dt$ and
$dy = \dot{y}\,dt$ (the dot usually denotes the derivative with respect to
a parameter) which implies

    $y'_x = \frac{dy}{dx} = \frac{\dot{y}}{\dot{x}}$    (25)
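
The formula for the derivative of a parametrically represented function lends itself to a direct numerical check. The sketch below is an illustration only: the cycloid $x = t - \sin t$, $y = 1 - \cos t$ is our own choice of test curve (for it $\dot{y}/\dot{x} = \sin t/(1 - \cos t) = \cot(t/2)$), and `parametric_slope` is a hypothetical helper name.

```python
import math


def parametric_slope(x, y, t, h=1e-6):
    """Slope dy/dx of the curve (x(t), y(t)) as the ratio of the dotted derivatives."""
    x_dot = (x(t + h) - x(t - h)) / (2.0 * h)   # derivative of x with respect to t
    y_dot = (y(t + h) - y(t - h)) / (2.0 * h)   # derivative of y with respect to t
    return y_dot / x_dot


if __name__ == "__main__":
    t = 1.0
    slope = parametric_slope(lambda s: s - math.sin(s), lambda s: 1.0 - math.cos(s), t)
    print(slope, 1.0 / math.tan(t / 2.0))
```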


All the properties of differentials are used, in particular, for
linearizing relations between variables, that is for passing from
a general, non-linear, relation to the linear relation between the
increments of the variables. Such a linearization is possible in case
the changes of the variables are small, and it is based on dropping
infinitesimals of higher order.

* This property is usually called the invariance of the form of the
differential. (Tr.)

Thus, for instance, equation (II.30) characterizes the non-linear
relation between the coordinates of a point $M(x, y)$ belonging
to a curve of the second order. But now let the position of the point
$M$ change near a fixed point $M_0(x_0, y_0)$, that is let the increments
$x - x_0 = \xi$ and $y - y_0 = \eta$ be small. Differentiating equation
(II.30) and replacing the differentials by the increments we come
to the linearized equation

    $2Ax_0\xi + 2By_0\xi + 2Bx_0\eta + 2Cy_0\eta + D\xi + E\eta = 0$    (26)

which describes the approximate linear relation between $\xi$ and $\eta$.
Since when deducing equation (26) we replaced the differentials
$dx$ and $dy$ by the increments $\xi$ and $\eta$, the point $M$ of the curve satisfies
the equation only to within infinitesimals of higher
order. Equation (26) is precisely satisfied by the points of the tangent
line to curve (II.30) drawn through the point $M_0$.

The linearization is widely used in physics, in particular, when
differential equations are deduced (see Sec. XIV.6).

10. Application of Differentials to Approximate Calculations.
Differentials are widely used for approximate calculations. First
of all, the increment of a function is often replaced by its differential
which, as a rule, can be found in a simpler way.

Suppose we are given a function $y = f(x)$. Let a particular value
$f(a)$ be known. Let the argument $x$ receive a small increment $\Delta x = h$.
Then we can put

    $f(a + h) - f(a) = \Delta y \approx dy = f'(a)\,h$

that is

    $f(a + h) \approx f(a) + f'(a)\,h$    (27)

Choosing the concrete functions $\sqrt[3]{x}$, $\sin x$, $\ln x$ and so forth as
$f(x)$ we derive the approximate formulas

    $\sqrt[3]{a + h} \approx \sqrt[3]{a} + \frac{h}{3\sqrt[3]{a^2}}$
    $\sin(a + h) \approx \sin a + h \cos a$    (28)
    $\ln(a + h) \approx \ln a + \frac{h}{a}$

etc. which are applicable for small $|h|$. The formulas can be refined
and their errors can be effectively estimated. This question
will be discussed in Secs. 15-16.

Let us consider an example. Suppose we know that $\ln 2 = 0.693$.
Then calculating with an accuracy of 0.001 we get

    $\ln 2.1 = \ln(2 + 0.1) \approx \ln 2 + \frac{0.1}{2} = 0.693 + 0.050 = 0.743$

The table of logarithms gives the value $\ln 2.1 = 0.742$, i.e. the
error is smaller than 0.2 per cent.

It is sometimes necessary to transform the expression that must
be calculated in order to facilitate the calculations. For instance,
it is wrong to calculate $\sqrt[3]{2}$ as

    $\sqrt[3]{2} = \sqrt[3]{1 + 1} \approx \sqrt[3]{1} + \frac{1}{3\sqrt[3]{1^2}} = 1 + \frac{1}{3} = 1.333$

since the value $h = 1$ can hardly be regarded as small in comparison
with $a = 1$. It is convenient to put here $\sqrt[3]{2} = \frac{\sqrt[3]{2m^3}}{m}$ and to choose
the integer $m$ so that $2m^3$ should become as close as possible to an
exact cube of an integer. It is possible to take $m = 4$ since $2 \cdot 4^3 = 128$
is close to $125 = 5^3$; then we get

    $\sqrt[3]{2} = \frac{1}{4}\sqrt[3]{2 \cdot 4^3} = \frac{1}{4}\sqrt[3]{128} = \frac{1}{4}\sqrt[3]{125 + 3} \approx \frac{1}{4}\left(\sqrt[3]{125} + \frac{3}{3\sqrt[3]{125^2}}\right) = \frac{1}{4}\left(5 + \frac{1}{25}\right) = 1.2600$

Tables of roots yield the value $\sqrt[3]{2} = 1.2599$, i.e. the error is smaller
than 0.01 per cent.
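
Both worked examples follow one and the same pattern, $f(a + h) \approx f(a) + f'(a)h$, and can be reproduced with a few lines of code. The following sketch is an illustration only (the helper `linear_approx` and the variable names are our own); unlike the text, it uses the unrounded value of $\ln 2$.

```python
import math


def linear_approx(f, df, a, h):
    """First-order approximation f(a + h) ~ f(a) + f'(a)*h."""
    return f(a) + df(a) * h


def cbrt(x):
    """Cube root for x > 0."""
    return x ** (1.0 / 3.0)


def dcbrt(x):
    """Derivative of the cube root: 1 / (3 * x**(2/3))."""
    return 1.0 / (3.0 * x ** (2.0 / 3.0))


ln_21 = linear_approx(math.log, lambda x: 1.0 / x, 2.0, 0.1)
cbrt2_naive = linear_approx(cbrt, dcbrt, 1.0, 1.0)             # h is not small here
cbrt2_scaled = linear_approx(cbrt, dcbrt, 125.0, 3.0) / 4.0    # h = 3 is small next to a = 125

if __name__ == "__main__":
    print(ln_21, math.log(2.1))
    print(cbrt2_naive, cbrt2_scaled, cbrt(2.0))
```

Comparing the two approximations of the cube root with the library value shows how much is gained by making $h$ small relative to $a$ before linearizing.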

Differentials are also used for estimating errors. Suppose the variables
$x$ and $y$ are connected by a functional relation $y = f(x)$, and
let the approximate value $\tilde{x}$ of the argument $x$ be known with the
maximum absolute error $\alpha_x$ (see Sec. I.7). Then, of course, $\tilde{y} = f(\tilde{x})$
should be taken as an approximate value of $y$. To estimate the
maximum absolute error $\alpha_y$ we observe that $x = \tilde{x} + h$ where
$|h| \le \alpha_x$ and therefore, if $\alpha_x$ (and, consequently, $h$) is small, then

    $y = \tilde{y} + \Delta y \approx \tilde{y} + dy = \tilde{y} + f'(\tilde{x})\,h$

that is

    $|y - \tilde{y}| \approx |f'(\tilde{x})| \cdot |h| \le |f'(\tilde{x})|\,\alpha_x$

Thus, we can put

    $\alpha_y = |f'(\tilde{x})|\,\alpha_x$    (29)

For example, let $y = x^n$. Then

    $\alpha_y = |n\tilde{x}^{n-1}|\,\alpha_x$

and the corresponding maximum relative errors are connected by
the simple formula

    $\delta_y = \frac{\alpha_y}{|\tilde{y}|} = \frac{|n\tilde{x}^{n-1}|\,\alpha_x}{|\tilde{x}^n|} = \frac{|n|\,\alpha_x}{|\tilde{x}|} = |n|\,\delta_x$

As another example let us consider $\ln 10.7$ where the
value 10.7 is approximate and is known with an accuracy of 0.1.
We have $\ln 10.7 = 2.3702$ according to the table but it is obvious
that this result contains too many decimal digits. To understand
what the accuracy of the result is we must take into account that
in our case $\alpha_x = 0.1$ which implies

    $\alpha_y = \frac{1}{10.7} \cdot 0.1 \approx 0.01$

that is, the result should be put down in the form $\ln 10.7 = 2.37$.
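
The estimate of the maximum absolute error of a function value, $|f'|$ times the error of the argument, can be compared with the actual spread of the function over the interval of uncertainty. The sketch below is only an illustration (the function name `propagated_error` is our own) and uses the logarithm example of the text.

```python
import math


def propagated_error(df, x_approx, alpha_x):
    """Maximum absolute error of f(x) estimated as |f'(x_approx)| * alpha_x."""
    return abs(df(x_approx)) * alpha_x


if __name__ == "__main__":
    alpha_y = propagated_error(lambda x: 1.0 / x, 10.7, 0.1)
    print(alpha_y)
    # Actual deviations of ln x at the ends of the interval 10.7 +/- 0.1:
    print(math.log(10.8) - math.log(10.7), math.log(10.7) - math.log(10.6))
```

The two actual deviations lie on either side of the linear estimate, which is why only two decimal digits of the logarithm can be trusted.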

§ 3. Derivatives and Differentials of Higher Orders 

11. Derivatives of Higher Orders. Let $y = f(x)$. Then the derivative
$y' = f'(x)$ which was studied in § 1 is called the derivative
of the first order or the first derivative of the function $f(x)$. In its
turn, $f'(x)$ is also a function of $x$ and therefore it is possible to take
its derivative which is called the derivative of the second order
or the second derivative of the original function:

    $y'' = (y')' = f''(x)$

In the same way we define the derivative of the third order (the
third derivative):

    $y''' = (y'')' = f'''(x)$

The subsequent derivatives are denoted as $y^{(4)} = y^{IV}$, $y^{(5)} = y^{V}$
etc. For example, $(x^3)' = 3x^2$, $(x^3)'' = (3x^2)' = 6x$; $(\sin x)' = \cos x$,
$(\sin x)'' = (\cos x)' = -\sin x$ and the like. The derivative of the
second order sometimes has a clear physical meaning: thus, in the
first example of Sec. 1 the derivative of the second order of the distance
with respect to the time is the velocity of change of the
instantaneous velocity, that is the instantaneous acceleration. We
shall discuss the applications of the derivatives of higher orders
in Sec. 15 and further.

The formula for the derivative of a sum is quite simple. If $y = u + v$
then $y' = u' + v'$, $y'' = (u' + v')' = u'' + v''$ and so
on. Generally,

    $(u + v)^{(n)} = u^{(n)} + v^{(n)}$

As for the formula for the derivative of a product, we have

    $(uv)' = u'v + uv'$,
    $(uv)'' = (u'v + uv')' = u''v + u'v' + u'v' + uv'' = u''v + 2u'v' + uv''$,
    $(uv)''' = (u''v + 2u'v' + uv'')' = u'''v + 2u''v' + u'v'' + u''v' + 2u'v'' + uv''' = u'''v + 3u''v' + 3u'v'' + uv'''$ etc.    (30)

Here computing the next derivative we begin with differentiating
the first factors in all the terms and after that we differentiate the
second factors in all the terms. These computations are carried out
according to the scheme resembling that of expanding the expressions
$(a + b)^2$, $(a + b)^3$ and so forth:

    $(a + b)^2 = (a + b)(a + b) = a^2 + ab + ab + b^2 = a^2 + 2ab + b^2$,
    $(a + b)^3 = (a^2 + 2ab + b^2)(a + b) = a^3 + 2a^2b + ab^2 + a^2b + 2ab^2 + b^3 = a^3 + 3a^2b + 3ab^2 + b^3$ etc.    (31)

Therefore the coefficients in formulas (30) are the same as in formulas
(31). In the general case the formula (the Leibniz rule) can be
written as

    $(uv)^{(n)} = u^{(n)}v + \binom{n}{1} u^{(n-1)}v' + \binom{n}{2} u^{(n-2)}v'' + \dots + uv^{(n)}$    (32)

where $\binom{n}{1}$, $\binom{n}{2}$, ... are so-called binomial coefficients, i.e.
the coefficients occurring in the binomial formula [the expansion
of the power $(a + b)^n$].
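
The Leibniz rule can be tested exactly on polynomials, for which all derivatives are computed by integer arithmetic. The sketch below is an illustration under our own conventions: a polynomial is stored as a list of coefficients in increasing powers, and all function names are hypothetical.

```python
from math import comb


def poly_derivative(p, order=1):
    """Derivative of the given order; p[k] is the coefficient of x**k."""
    for _ in range(order):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p


def poly_mul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Sum of two polynomials (the shorter list is padded with zeros)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_eval(p, x):
    """Value of the polynomial at x (Horner's scheme)."""
    result = 0
    for c in reversed(p):
        result = result * x + c
    return result


def leibniz(u, v, n):
    """n-th derivative of u*v as the sum of C(n,k) * u^(n-k) * v^(k)."""
    total = [0]
    for k in range(n + 1):
        term = poly_mul(poly_derivative(u, n - k), poly_derivative(v, k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return total


if __name__ == "__main__":
    u = [1, 2, 0, 1]      # 1 + 2x + x^3
    v = [0, 1, 3]         # x + 3x^2
    direct = poly_derivative(poly_mul(u, v), 3)
    print(poly_eval(direct, 2), poly_eval(leibniz(u, v, 3), 2))
```

The two results are compared by their values rather than by their coefficient lists, since the lists may differ by trailing zeros.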

The calculation of the derivative of an implicit function will be
illustrated by taking equation (12). Further differentiation of equality
(13) yields

    $\frac{2}{a^2} + \frac{2}{b^2}(y'^2 + yy'') = 0$, i.e.

    $y'' = -\frac{\frac{b^2}{a^2} + y'^2}{y} = -\frac{\frac{b^2}{a^2} + \frac{b^4x^2}{a^4y^2}}{y} = -\frac{b^4}{a^2y^3}\left(\frac{y^2}{b^2} + \frac{x^2}{a^2}\right) = -\frac{b^4}{a^2y^3}$    (33)

[here we have used expression (14) for $y'$]. The following derivatives
are computed in a similar way. Formula (33) can also be obtained
by differentiating equality (14).

In addition, we shall turn to differentiating a function represented
by its parametric equations. Differentiating formula (25) we get

    $y''_{xx} = \frac{d(y'_x)}{dx} = \frac{d\left(\frac{\dot{y}}{\dot{x}}\right)}{\dot{x}\,dt} = \frac{\ddot{y}\dot{x} - \dot{y}\ddot{x}}{\dot{x}^3}$

(the two dots denote the second derivative with respect to a parameter).
Here, as in deducing formula (25), we need not pay attention
to the question of which of the variables is independent when we calculate
the differential (the first differential, as we shall call it in Sec. 12).
The derivatives of higher order can be found similarly.

12. Higher-Order Differentials. Let $y = f(x)$. Then

    $dy = f'(x)\,dx$    (34)

The differential calculated by formula (34) is called the first differential
(the first-order differential). The differential of the second
order (the second-order differential) of a function is the first differential
of the first differential of the function. It is denoted by $d^2y$.
If $x$ is an independent variable then in the process of the repeated
differentiation the quantity $dx$ is regarded as being independent
of $x$. Thus, $dx$ being a constant, we take it outside the differentiation
sign:

    $d^2y = d(dy) = d(f'(x)\,dx) = d(f'(x))\,dx = (f'(x))'\,dx\,dx = f''(x)\,dx^2$    (35)

where the notation $dx^2 = (dx)^2$ is assumed. Similarly, we obtain

    $d^3y = d(d^2y) = f'''(x)\,dx^3$    (36)

and so on. This enables us to write the derivatives of higher order
as the ratios of the corresponding differentials:

    $y'' = \frac{d^2y}{dx^2}$,  $y''' = \frac{d^3y}{dx^3}$  etc.    (37)

In addition, we see that if $dy$ is an infinitesimal of the first order
relative to $dx$ then $d^2y$ is an infinitesimal of the second order, $d^3y$
is an infinitesimal of the third order and so on. Further, we note that

    $d^2x = d(dx) = d(1 \cdot dx) = dx\,d(1) = 0$

that is the second-order differential of an independent variable is equal
to zero; of course, all the following differentials of an independent
variable are also equal to zero.

In case $x$ is not an independent variable (or we do not know whether
it is independent or not) formula (34), as was seen in Sec. 9, is
nevertheless true. But using this formula for subsequent differentiations
we must not consider $dx$ constant but should use the rule of
differentiating a product [see formula (24)]:

    $d^2y = d(f'(x)\,dx) = d(f'(x)) \cdot dx + f'(x)\,d(dx) = f''(x)\,dx^2 + f'(x)\,d^2x$    (38)

In a similar way we find

    $d^3y = f'''(x)\,dx^3 + 3f''(x)\,dx\,d^2x + f'(x)\,d^3x$    (39)

(check it!) and all the following differentials. If now it turns out
that $x$ is an independent variable then $d^2x = d^3x = 0$, and formula
(38) turns into formula (35) and formula (39) turns into formula
(36). Thus, formulas (35)-(37) should be used only in case $x$ is an independent
variable.
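
The difference between the two forms of the second differential becomes visible as soon as $x$ depends on another variable. The following sketch is an illustration under our own assumptions: it takes $y = \sin x$ with $x = t^2$, divides the second differential by $dt^2$, and compares the complete expression $f''(x)\,\dot{x}^2 + f'(x)\,\ddot{x}$ with a second difference quotient; all names are hypothetical.

```python
import math


def second_difference(g, t, h=1e-4):
    """Symmetric second difference quotient approximating g''(t)."""
    return (g(t + h) - 2.0 * g(t) + g(t - h)) / (h * h)


def with_second_differential_of_x(t):
    """f''(x)*x_dot**2 + f'(x)*x_ddot for f = sin, x = t**2."""
    x, x_dot, x_ddot = t * t, 2.0 * t, 2.0
    return -math.sin(x) * x_dot ** 2 + math.cos(x) * x_ddot


def without_second_differential_of_x(t):
    """Only f''(x)*x_dot**2: valid only when x is the independent variable."""
    x, x_dot = t * t, 2.0 * t
    return -math.sin(x) * x_dot ** 2


if __name__ == "__main__":
    t = 0.7
    print(second_difference(lambda s: math.sin(s * s), t))
    print(with_second_differential_of_x(t), without_second_differential_of_x(t))
```

Only the first of the two closed forms agrees with the difference quotient; the discrepancy of the second is exactly the term containing the second differential of $x$.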

The notion of the derivative and that of the differential and all
the basic rules of operating on them were elaborated by Newton
(1666) and by Leibniz (1684) although these notions had been used
for some particular problems before.

Differential calculus has very many applications in the field of
investigating the behaviour of functions. These applications will
be considered in the forthcoming sections of our course.


§ 4. L'Hospital's Rule

13. Indeterminate Forms of the Type $\frac{0}{0}$. We said in Sec. III.7 that
the evaluation of the limit of the ratio of two infinitesimals might
yield different results in different cases. J. Bernoulli discovered a
simple rule for evaluating such a limit which is applicable to many
cases. This rule was published in 1696 in the first printed textbook
on differential calculus written by L'Hospital, a French mathematician
(1661-1704). Let it be necessary to evaluate the limit

    $\lim_{t \to t_0} \frac{\varphi(t)}{\psi(t)}$    (40)

and

    $\varphi(t_0) = \psi(t_0) = 0$    (41)

that is let us have an indeterminate form of the type $\frac{0}{0}$. Suppose
that the limit (finite or infinite)

    $\lim_{t \to t_0} \frac{\varphi'(t)}{\psi'(t)} = k$    (42)

is found in some way. Then we assert that the limit (40) is also equal
to $k$, that is we have

    $\lim_{t \to t_0} \frac{\varphi(t)}{\psi(t)} = \lim_{t \to t_0} \frac{\varphi'(t)}{\psi'(t)}$    (43)

for indeterminate forms of the type $\frac{0}{0}$.

In order to prove the rule let us consider the curve $y = \varphi(t)$,
$x = \psi(t)$ in the $x, y$-plane. Then, as $t \to t_0$, by (41), this curve
approaches the origin of the coordinate system. To find out in what
way it approaches the origin (i.e. like a spiral or along a certain
direction which then should be specified etc.) we note that, by (42),
we have

    $\frac{dy}{dx} = \frac{\varphi'(t)\,dt}{\psi'(t)\,dt} = \frac{\varphi'(t)}{\psi'(t)} \to k$  (as $t \to t_0$)

Hence (see Fig. 119), while the curve approaches the origin the
tangent to the curve turns in this process and tends to the limiting
position in which it forms an angle $\alpha_0$ with the $x$-axis such that
$\tan\alpha_0 = k$. But then the angle $\beta$ ("the angle of elevation") also
tends to $\alpha_0$, i.e.

    $\frac{\varphi(t)}{\psi(t)} = \frac{y}{x} = \tan\beta \to \tan\alpha_0 = k$  (as $t \to t_0$)    (44)

which is just what we set out to prove.

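It sometimes happens that using L’Hospital’s rule we obtain
a ratio of derivatives which is again an indeterminate form of the
type $\frac{0}{0}$; then it is possible to apply the rule again and so forth.
For example,

$$\lim_{x \to 0} \frac{x - \sin x}{x^3} = \left(\frac{0}{0}\right) = \lim_{x \to 0} \frac{1 - \cos x}{3x^2} = \left(\frac{0}{0}\right) = \lim_{x \to 0} \frac{\sin x}{6x} = \left(\frac{0}{0}\right) = \lim_{x \to 0} \frac{\cos x}{6} = \frac{1}{6},$$

$$\lim_{x \to 1} \frac{2^x - 4 \times 2^{-x}}{(x - 1)^2} = \left(\frac{0}{0}\right) = \lim_{x \to 1} \frac{2^x \ln 2 + 4 \times 2^{-x} \ln 2}{2(x - 1)} = \frac{4 \ln 2}{\pm 0} = \pm\infty$$

We have applied L’Hospital’s rule three times in the first example
and once in the second example.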
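
The first example can be checked numerically. The sketch below is only an illustration (the function names are ours, not the book's): it evaluates the original ratio for a few small values of x and compares it with the ratio of third derivatives, cos x / 6, which is what three applications of the rule produce.

```python
import math

def ratio(x):
    """Original 0/0 form: (x - sin x) / x**3."""
    return (x - math.sin(x)) / x**3

def ratio_after_three_steps(x):
    """Ratio of the third derivatives: cos x / 6."""
    return math.cos(x) / 6.0

# Both columns should approach 1/6 as x tends to zero.
for x in (0.5, 0.1, 0.01):
    print(f"x = {x:<5} ratio = {ratio(x):.6f}  cos(x)/6 = {ratio_after_three_steps(x):.6f}")
```

Very small values of x are avoided on purpose: the numerator x - sin x suffers from cancellation in floating-point arithmetic, which is one more reason to prefer the analytic rule.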

L’Hospital’s rule for indeterminate forms of type (40) always
achieves the aim in case $t_0$ is a finite number and the numerator
and the denominator are infinitesimals of an integral order of
smallness relative to $t - t_0$ (see Sec. III.10). Indeed, L’Hospital’s rule
implies that each differentiation reduces the order of the
infinitesimals by unity, and after several steps we shall obtain variables
of the “zero order” (that is having a finite limit) in the numerator
or in the denominator (or in both of them) and thus we shall no
longer have an indeterminate form.

[Fig. 119 and Fig. 120]

14. Indeterminate Forms of the Type $\frac{\infty}{\infty}$. L’Hospital’s rule (43)
also remains true for indeterminate forms of the type $\frac{\infty}{\infty}$, i.e. when
the condition

$$|\varphi(t_0)| = |\psi(t_0)| = \infty$$

is put instead of (41).

The proof for this case is analogous to the one given in Sec. 13
but now the curve $y = \varphi(t)$, $x = \psi(t)$ does not approach the origin
of the coordinate system as $t \to t_0$ but travels to infinity (see Fig. 120).
In this process the curve, by condition (42), turns in such a way
that the angle $\alpha$ which the curve (i.e. its tangent) forms with the
$x$-axis tends to $\alpha_0$ where $\tan \alpha_0 = k$. But then the distance passed
by the point $M$ of the curve along the straight line $ll$ (see Fig. 120)
will be an infinitely large variable of higher order than the distance
along the straight line transversal to $ll$, namely, $MM' \ll OM$.
Hence, $\angle MOM' \to 0$ as the point $M$ travels to infinity and
therefore the “angle of elevation” $\beta$ tends to $\alpha_0$ and we can write the same
formula (44) again which concludes the proof.

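We give here several important examples:

$$\lim_{x \to \infty} \frac{\ln x}{x^k} = \left(\frac{\infty}{\infty}\right) = \lim_{x \to \infty} \frac{1/x}{k x^{k-1}} = \lim_{x \to \infty} \frac{1}{k x^k} = 0 \quad (k > 0),$$

$$\lim_{x \to \infty} \frac{x}{a^x} = \left(\frac{\infty}{\infty}\right) = \lim_{x \to \infty} \frac{1}{a^x \ln a} = 0 \quad (a > 1),$$

$$\lim_{x \to \infty} \frac{x^k}{b^x} = \lim_{x \to \infty} \left(\frac{x}{a^x}\right)^k = 0 \quad (k > 0,\ b > 1,\ a = \sqrt[k]{b})$$

Hence, a logarithmic function with a base greater than unity
tends to infinity (as its argument tends to infinity) slower than any
power function with a positive exponent. Besides, a power function
tends to infinity slower than any exponential function with a base
greater than unity.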

L’Hospital’s rule can be applied to some indeterminate forms
of other types (see Sec. III.5, property 6 and the beginning of
Sec. III.15) after they are transformed into forms of the type $\frac{0}{0}$ or
$\frac{\infty}{\infty}$. This can be achieved according to the following scheme:

$$0 \cdot \infty = 0 \cdot \frac{1}{0} = \frac{0}{0} \quad \text{or} \quad \infty - \infty = \frac{1}{0} - \frac{1}{0} = \frac{0 - 0}{0 \times 0} = \frac{0}{0}$$

These formulas should be understood in a conditional sense. We
use them only to indicate the types of the variables. After taking
logarithms we can also apply L’Hospital’s rule to power
indeterminate forms.

§ 5. Taylor's Formula and Series

15. Taylor’s Formula. It was shown in Sec. 10 that replacing
the increment of a function by its differential we can deduce many
approximate formulas. It turns out that these formulas can be made
much more accurate if we apply differentials of higher order. This
problem is solved by means of Taylor’s formula named after
B. Taylor (1685-1731), an English mathematician.

Let us first suppose that we are given a polynomial $P(x)$. A
polynomial is usually considered as expanded into powers of $x$ but it
is quite easy to expand it in powers of $x - a$ where $a$ is an arbitrary
number.

Suppose, for example, that we are going to expand the polynomial
$P(x) = 5 - 3x + 2x^3$ in powers of $x - 4$. In order to do this it
is sufficient to substitute $x = [4 + (x - 4)]$ and then remove the
square brackets without removing the parentheses:

$$P(x) = 5 - 3[4 + (x - 4)] + 2[4 + (x - 4)]^3 =$$
$$= 5 - 12 - 3(x - 4) + 128 + 96(x - 4) + 24(x - 4)^2 + 2(x - 4)^3 =$$
$$= 121 + 93(x - 4) + 24(x - 4)^2 + 2(x - 4)^3$$

In the general case, for a polynomial of degree $n$, we can write

$$P(x) = a_0 + a_1 (x - a) + a_2 (x - a)^2 + a_3 (x - a)^3 + \ldots + a_n (x - a)^n \tag{45}$$

The coefficients here can be found in the following way. First we
put $x = a$ and obtain $P(a) = a_0$. Then we differentiate
formula (45):

$$P'(x) = a_1 + 2a_2 (x - a) + 3a_3 (x - a)^2 + \ldots + n a_n (x - a)^{n-1}$$

If now we put here $x = a$ this will yield $P'(a) = a_1$. Let us
differentiate once again:

$$P''(x) = 1 \times 2 a_2 + 2 \times 3 a_3 (x - a) + \ldots + (n - 1) n a_n (x - a)^{n-2}$$

which implies $P''(a) = 1 \times 2 a_2$. Further, in a similar way we
derive $P'''(a) = 1 \times 2 \times 3 a_3$ and so on. Generally, $P^{(k)}(a) = k!\, a_k$
(where $k! = 1 \times 2 \times \ldots \times k$) which yields

$$a_k = \frac{P^{(k)}(a)}{k!}$$

Thus, formula (45) can be rewritten in the following form:

$$P(x) = P(a) + \frac{P'(a)}{1!}(x - a) + \frac{P''(a)}{2!}(x - a)^2 + \ldots + \frac{P^{(n)}(a)}{n!}(x - a)^n = \sum_{k=0}^{n} \frac{P^{(k)}(a)}{k!}(x - a)^k \tag{46}$$

where $\sum$ is the summation sign (see Sec. III.6).

For example, taking the polynomial from the previous example
we deduce

$$P'(x) = -3 + 6x^2, \quad P''(x) = 12x, \quad P'''(x) = 12,$$

$$P(4) = 5 - 3 \times 4 + 2 \times 4^3 = 121, \quad \frac{P'(4)}{1!} = -3 + 6 \times 4^2 = 93,$$

$$\frac{P''(4)}{2!} = \frac{12 \times 4}{2} = 24, \quad \frac{P'''(4)}{3!} = \frac{12}{6} = 2$$

that is we obtain the same values of the coefficients as before.
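
The same computation is easy to mechanize. The following sketch is our own illustration (the helper names are hypothetical): it stores a polynomial as a list of coefficients in ascending powers of x and finds the coefficients of its expansion in powers of x - a from the rule that the k-th coefficient equals the k-th derivative at a divided by k!.

```python
from fractions import Fraction
from math import factorial

def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (ascending powers)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def expand_about(coeffs, a):
    """Return [a0, a1, ...] such that P(x) = sum of a_k * (x - a)**k."""
    shifted, current, k = [], list(coeffs), 0
    while current:
        # a_k = P^(k)(a) / k!
        shifted.append(Fraction(evaluate(current, a), factorial(k)))
        current = derivative(current)
        k += 1
    return shifted

# P(x) = 5 - 3x + 2x^3 expanded in powers of x - 4
print([int(c) for c in expand_about([5, -3, 0, 2], 4)])
```

Exact fractions are used so that the division by k! introduces no rounding; for the polynomial of the example all four coefficients turn out to be integers.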

If now we take an arbitrary function $f(x)$ in place of a polynomial
$P(x)$ then formula (46) will no longer hold. But if we denote the
difference between the left-hand and the right-hand sides of
formula (46) by $R_n(x)$ (the remainder) then we can write

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \ldots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_n(x) \tag{47}$$

It is this formula that is called Taylor’s formula. The most
essential thing about the formula is that the remainder is an infinitesimal
of at least $(n + 1)$th order relative to $x - a$ as $x \to a$, that is $R_n(x)$
is an infinitesimal of higher order than the last of the “exact” terms
put down in formula (47). In order to prove this assertion let us
suppose, for simplicity’s sake, that $n = 2$, that is

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + R_2(x)$$

Finding $R_2(x)$ from this and applying L’Hospital’s rule (see
Sec. 13) we obtain

$$\lim_{x \to a} \frac{R_2(x)}{(x - a)^3} = \lim_{x \to a} \frac{f(x) - f(a) - f'(a)(x - a) - \frac{f''(a)}{2!}(x - a)^2}{(x - a)^3} = \left(\frac{0}{0}\right) =$$
$$= \lim_{x \to a} \frac{f'(x) - f'(a) - f''(a)(x - a)}{3(x - a)^2} = \left(\frac{0}{0}\right) = \lim_{x \to a} \frac{f''(x) - f''(a)}{3 \times 2 (x - a)} = \left(\frac{0}{0}\right) = \lim_{x \to a} \frac{f'''(x)}{3!} = \frac{f'''(a)}{3!} \tag{48}$$

i.e. the ratio has a finite limit as $x \to a$. This just implies
(see Sec. III.10) our assertion about the order of smallness of $R_2(x)$,
and an analogous consideration holds for $R_n(x)$.

Let us denote $x = a + h$. We see that dropping the remainder
in formula (47) for successively increasing values of $n$ we shall
obtain approximate formulas with, respectively, increasing degrees
of accuracy (for small values of $|h|$):

$$f(a + h) \approx f(a) + f'(a) h \tag{49}$$

[this formula coincides with formula (27) and guarantees an
accuracy to a term of the order of $h^2$],

$$f(a + h) \approx f(a) + f'(a) h + \frac{f''(a)}{2!} h^2 \tag{50}$$

with an accuracy of the order of $h^3$,

$$f(a + h) \approx f(a) + f'(a) h + \frac{f''(a)}{2!} h^2 + \frac{f'''(a)}{3!} h^3 \tag{51}$$

with an accuracy of the order of $h^4$ and so forth.

The polynomials (in $h = x - a$) entering into the right-hand
sides are called Taylor’s polynomials. They yield, in some sense,
the best approximate expression of a function $f(x)$ in the form of
a polynomial of a given degree near the value $x = a$. Namely,
a Taylor polynomial differs from $f(x)$ by a term which is an
infinitesimal of the highest order in comparison with all the polynomials
of the same degree as $x \to a$. For instance, even if we change only
one of the coefficients on the right-hand side of (50) then the
difference may become an infinitesimal of order 0, 1 or 2 but not an
infinitesimal of the third order relative to $x - a$ (as $x \to a$).
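
The stated orders of accuracy can be observed numerically. The sketch below is an illustration under our own assumptions: we take the exponential function (all of whose derivatives are known) and the point a = 0, and estimate the order of the error of the Taylor polynomial of degree n by halving h and comparing the two errors.

```python
import math

def taylor_exp(a, h, n):
    """Taylor polynomial of degree n for exp at the point a, evaluated at a + h."""
    # Every derivative of exp at a equals exp(a).
    return sum(math.exp(a) * h**k / math.factorial(k) for k in range(n + 1))

def observed_order(n, a=0.0, h=0.1):
    """Estimate p in error ~ C * h**p by comparing the errors for h and h/2."""
    e1 = abs(math.exp(a + h) - taylor_exp(a, h, n))
    e2 = abs(math.exp(a + h / 2) - taylor_exp(a, h / 2, n))
    return math.log2(e1 / e2)

for n in (1, 2, 3):
    print(n, round(observed_order(n), 2))   # expected to be close to n + 1
```

For n = 1, 2, 3 the estimated orders come out close to 2, 3 and 4, in agreement with the remarks accompanying formulas (49), (50) and (51).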

16. Taylor’s Series. Since the errors of formulas (49), (50), (51)
etc. are becoming infinitesimals of greater and still greater order
(as $n \to \infty$) it is quite natural to expect that for small $|h|$ it is
permissible to pass to the limit and thus obtain an “exact”
representation of $f(a + h)$ in the form of a sum of an infinite series (see
Sec. III.6), that is

$$f(a + h) = f(a) + \frac{f'(a)}{1!} h + \frac{f''(a)}{2!} h^2 + \ldots = f(a) + \sum_{n=1}^{\infty} \frac{f^{(n)}(a)}{n!} h^n \tag{52}$$

This series is called Taylor’s series; it was originally introduced by
B. Taylor in 1715. Such series will be systematically treated in
§ 3 of Chapter XVII. It will be shown there that the above-mentioned
supposition is true. By the way, we shall answer the question what
particular values of $h$ guarantee the possibility of using formula (52).
It turns out that it is always permissible to apply the formula if
the series is practically convergent in the sense described in the
end of Sec. III.6 [but in this case the function which should be
expanded into Taylor’s series must not be represented by different
formulas on different parts of the range of its argument (see Sec. I.13)].
Taking this comment into account we shall now use Taylor’s series.

Formula (52) can be rewritten (by substituting $a + h = x$ and
$h = x - a$) in the form

$$f(x) = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \ldots \tag{53}$$

which is an expansion into powers of $x - a$. In particular, when
$a = 0$ we obtain an expansion in powers of $x$:

$$f(x) = f(0) + \frac{f'(0)}{1!} x + \frac{f''(0)}{2!} x^2 + \ldots \tag{54}$$

Series (54) is sometimes called Maclaurin’s series after C. Maclaurin
(1698-1746), a Scottish mathematician, which is historically incorrect.

For example, let $f(x) = e^x$. Then $f'(x) = e^x$, $f''(x) = e^x$, ...
and

$$f(0) = 1, \quad f'(0) = 1, \quad f''(0) = 1, \ \ldots$$

Hence formula (54) results here in

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots \tag{55}$$

Let us evaluate the number $e$ with an accuracy of 0.001. In order
to do this let us put $x = 1$ and evaluate the terms one after another
retaining a reserve decimal digit:

e = 1.0000 + 1.0000 + 0.5000 + 0.1667 + 0.0417 +
  + 0.0083 + 0.0014 + 0.0002 + 0.0000

Every subsequent term here is obtained by dividing the foregoing
term by the next integer. As it is seen, the terms of the series
manifest an obvious tendency to decrease fast and, in addition, they
soon become less than the required degree of accuracy. Now
summing up and rounding the result we obtain $e = 2.718$.
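
The procedure just described (each term is the preceding one multiplied by x and divided by the next integer) translates directly into a short program. This is a minimal sketch; the function name and the stopping threshold `tol` are our own choices, the threshold playing the part of the reserve decimal digit.

```python
def exp_series(x, tol=1e-4):
    """Sum series (55) until the next term is smaller than tol in absolute value."""
    term, total, n = 1.0, 0.0, 0
    while abs(term) >= tol:
        total += term
        n += 1
        term *= x / n   # next term = previous term * x / n
    return total, n     # n is the number of terms actually summed

approx, terms_used = exp_series(1.0)
print(round(approx, 3), terms_used)
```

Stopping at the first term below the threshold is justified here only because the terms decrease fast; for a slowly convergent series the size of the last term says little about the remaining error.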

In a way similar to that of deducing (55) it is possible to obtain
the following formulas (we leave this to the reader):

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \ldots \tag{56}$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \ldots \tag{57}$$

$$\cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \ldots \tag{58}$$

$$\sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \frac{x^7}{7!} + \ldots \tag{59}$$

$$(1 + x)^p = 1 + \binom{p}{1} x + \binom{p}{2} x^2 + \binom{p}{3} x^3 + \ldots \tag{60}$$

(for any arbitrary $p$) where the binomial coefficients $\binom{p}{k}$ are defined
by the formulas $\binom{p}{1} = p$, $\binom{p}{2} = \frac{p(p-1)}{1 \times 2}$, ...,
$\binom{p}{k} = \frac{p(p-1) \ldots (p-k+1)}{k!}$. If $p$ is a positive integer,
$\binom{p}{p+1} = \binom{p}{p+2} = \ldots = 0$ and series (60) turns into a finite sum.
We thus obtain the expression for the binomial coefficients.

For the logarithmic function we similarly derive

$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots \tag{61}$$
J.L. Lagrange (1736-1813), a prominent French mathematician,
proved that the remainder $R_n(x)$ in formula (47) admits the
representation

$$R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} (x - a)^{n+1}$$

where $\xi$ is a certain value lying between $a$ and $x$. This
representation sometimes makes it possible to find out for what values of $x$
formula (53) holds because the formula is true if and only if
$R_n(x) \to 0$ as $n \to \infty$. For instance, if we take into account that
$\frac{h^n}{n!} \to 0$ as $n \to \infty$ for any $h$ it is easy to prove that
formulas (55)-(59) are true for any $x$ (think it over!).
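
As an illustration of how Lagrange's representation is used, take the sine function at a = 0. Every derivative of the sine does not exceed unity in absolute value, hence the remainder is bounded by |x|^(n+1)/(n+1)!. The sketch below (the function names are ours) compares the actual error of the Taylor polynomial with this bound.

```python
import math

def sin_taylor(x, n):
    """Taylor polynomial of degree n for sin about a = 0, see series (57)."""
    return sum((-1)**j * x**(2*j + 1) / math.factorial(2*j + 1)
               for j in range((n + 1) // 2))

def lagrange_bound(x, n):
    """Bound |R_n(x)| <= |x|**(n+1) / (n+1)!, valid because |sin^(n+1)| <= 1."""
    return abs(x)**(n + 1) / math.factorial(n + 1)

x = 2.0
for n in (1, 3, 5, 7, 9):
    error = abs(math.sin(x) - sin_taylor(x, n))
    print(n, f"{error:.2e}", f"{lagrange_bound(x, n):.2e}")
```

For a fixed x the bound tends to zero as n grows, which is exactly the argument suggested in the text for the validity of formula (57) at every x.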

Taylor’s series can be rewritten in another form if we denote
$x - a = \Delta x$; $f(x) - f(a) = \Delta f$; $f'(a)(x - a) = f'(a)\,\Delta x = df$;
$f''(a)(x - a)^2 = f''(a)(\Delta x)^2 = d^2 f$ (see Sec. 12) etc. Then we deduce
from (53),

$$\Delta f = df + \frac{d^2 f}{2!} + \frac{d^3 f}{3!} + \ldots + \frac{d^n f}{n!} + \ldots \tag{62}$$

Truncating this formula we obtain (for small $\Delta x$) approximate
formulas (more and still more accurate): $\Delta f \approx df$ [accurate to a term
of the order of $(\Delta x)^2$], $\Delta f \approx df + \frac{1}{2} d^2 f$ [accurate to a term of the
order of $(\Delta x)^3$] and so on.


§ 6. Intervals of Monotonicity. Extremum

17. Sign of Derivative. Let us consider a function $y = f(x)$.
We assume throughout this section that the function and its
derivative have no discontinuities. A sketch of the graph of the function
is shown in Fig. 121. Since $y' = \tan \alpha$ (see Sec. 3) the function
increases on every interval where its derivative is positive and
decreases on every interval where the derivative is negative. In other
words, if the rate of a variable is positive the variable increases
and if the rate is negative the variable decreases.

Since the derivative should pass through the zero value in a
continuous transition from its positive values to its negative values
there must be $y' = 0$ at those points where an interval of increase
borders on an interval of decrease (of course, the same takes place
when there is a passage from negative values to positive values of
the derivative). A point $x$ at which $f'(x) = 0$ is called a critical
point of the function $y = f(x)$; the instantaneous rate of change
of the function is equal to zero at such a point, that is critical points
serve as if they were “points of instantaneous state of rest”. There are
three critical points in Fig. 121: $a$, $b$ and $c$. The corresponding values
of the function are called its critical (or stationary) values.

From what has been said it follows that to determine the intervals
of monotonicity of $f(x)$ it is necessary to indicate all the critical
points of the function on the $x$-axis and then to investigate the sign
of $f'$ on each interval lying between neighbouring critical points.
The intervals where $f' > 0$ will be the intervals of increase of $f$
and the intervals on which $f' < 0$ will be the intervals of decrease
of $f$. In case the sign of $f'$ is the same in two neighbouring intervals
these intervals form an entire interval of monotonicity; thus,
intervals III and IV in Fig. 121 constitute a whole interval of increase
of the function $f(x)$.

It is also obvious that the function $f$ is constant on an interval if
and only if $f'(x) \equiv 0$ on the interval. Indeed, the function can neither
increase nor decrease on its interval of constancy.

18. Points of Extremum. If the value $f(x_0)$ at some point $x = x_0$
is greater than all the “neighbouring” values of the function (i.e. greater
than the values of $f(x)$ taken for $x$ lying sufficiently close to $x_0$)
then the point $x_0$ is called the point of maximum of the function $f(x)$
(or we say that $f(x)$ has a maximum at the point $x = x_0$), and $f(x_0)$
is called its maximal value. The point of minimum and the minimal
value of a function are defined in a similar way. Thus, the function
shown in Fig. 121 has a point of maximum at $x = a$ and a point
of minimum at $x = b$. In other cases a function can have any other
number of points of maximum or minimum and if a function is
continuous its maxima and minima must alternate. For example,
the function in Fig. 122 has three points of maximum and two points
of minimum; there is an infinitude of maxima and minima in Fig. 46
whereas there are no such points at all in Fig. 44.

A maximum or a minimum of a function is called an extremum
which means “an extreme value” of the function. From Sec. 17
it follows that points of extremum are such points at which the
derivative changes its sign in moving through them from left to right.
More definitely, if the sign of $f'(x)$ changes from $+$ to $-$ while $x$
moves through a point $x = a$ in the positive direction then the
function $f$ has a maximum at $x = a$ because the function increases
on the left of $x = a$ and decreases on the right of $x = a$ (see Fig. 121).
Similarly, the derivative changes its sign from $-$ to $+$ in moving
through a point of minimum.

It follows that under the conditions specified in Sec. 17 all the
points of extremum of a function are its critical points. This necessary
condition for an extremum was, in fact, formulated by P. Fermat.
As it is seen in Fig. 121 this condition is not sufficient, i.e. a critical
point may not be a point of extremum.

There are various sufficient conditions for the existence of an
extremum but they are used more seldom than the necessary
condition because in many concrete problems one often knows
beforehand that the extremum must exist. Its approximate location is
also usually known and it is only its exact value that remains
unknown. If the necessary condition indicates only one possible point
of extremum in the above circumstances then the extremum is sure
to be there. In case there are several extrema it is possible to find
them and determine the intervals of monotonicity (see Sec. 17)
simultaneously.

Since the values of a function change very slowly near a critical 
point Fermat’s condition implies that if a point of extremum is 
determined with an error then the error of evaluating the corres- 
ponding extreme value is of higher order of smallness. Therefore 
it is usually convenient (if it is possible) to reduce a problem of 
determining some quantity to the problem of evaluating an extreme 
or even a stationary (critical) value of a certain function. Then even 
a rough determination of a point of extremum yields a good ultimate 
result. 

Conditions for an extremum can also be established on the basis
of Taylor’s formula (see Sec. 15). Let us investigate a point $x = a$
for a function $f(x)$. It is seen from formula (49) that there is no
extremum at $x = a$ if $f'(a) \neq 0$ since a change of the sign of $h$
results in the change of the sign of $f'(a) h$ and therefore the sign
of the difference $f(a + h) - f(a)$ also changes [because the terms
of the order of $h^2$ are negligibly small compared to $f'(a) h$ for small
$|h|$].

If $f'(a) = 0$ and $f''(a) \neq 0$ then, by formula (50), we conclude
in like manner that there will be an extremum at $x = a$. Namely,
there will be a minimum at $x = a$ provided $f''(a) > 0$ [since in
this case $f(a + h) > f(a)$ for small $|h|$] and a maximum provided
$f''(a) < 0$. In case $f'(a) = 0$, $f''(a) = 0$ and $f'''(a) \neq 0$ formula
(51) implies that there will be no extremum at $x = a$. If $f'(a) = 0$,
$f''(a) = 0$, $f'''(a) = 0$ and $f^{IV}(a) \neq 0$ then an extremum exists
again and so on.
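
The chain of rules above amounts to finding the first derivative at the point a that does not vanish: an odd order means no extremum, an even order means a minimum or a maximum according to the sign. A minimal sketch of this rule follows; the function name, the tolerance `eps` and the returned labels are our own conventions.

```python
def classify(derivs, eps=1e-12):
    """Classify the point a from derivs = [f'(a), f''(a), f'''(a), ...]."""
    for order, value in enumerate(derivs, start=1):
        if abs(value) > eps:
            if order % 2 == 1:
                return "no extremum"          # first non-zero derivative of odd order
            return "minimum" if value > 0 else "maximum"
    return "undecided"                        # all the supplied derivatives vanish

print(classify([0, 2]))          # f(x) = x**2 at a = 0
print(classify([0, 0, 6]))       # f(x) = x**3 at a = 0
print(classify([0, 0, 0, -24]))  # f(x) = -x**4 at a = 0
```

The answer "undecided" only says that more derivatives are needed; the test is also inapplicable when the derivatives do not exist at the point.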

19. The Greatest and the Least Values of a Function. Let a
function $y = f(x)$ and its derivative be continuous on a closed interval
$a \le x \le b$. Suppose it is necessary to find the greatest and the least
values of the function. In Sec. 18 we considered extrema which were
attained at interior points of the intervals. But here we must also
take into account end-point extrema. For instance, the function in
Fig. 123 has a minimum at the end-point $x = a$ and a maximum at
the end-point $x = b$ but it also has two other extrema in the interior
of the interval. Of course, the derivative of a function is not
necessarily equal to zero at an end-point even if there is an extremum there.

Further, it should be noted that in Sec. 18 we investigated relative
extrema whereas now we are interested in the absolute maximum and
minimum. Therefore in order to find the greatest value of the
function on the interval $a \le x \le b$ it is necessary to determine all its
maxima at interior points as well as its end-point maxima on the
interval and then to compare the corresponding maximal values
with each other: the greatest of these maxima is just the greatest
value of the function. The same procedure yields the least value
of a function on a closed interval. To facilitate these calculations
we can simply compare all the critical and end-point values of the
function with each other: the greatest of the values will be the
absolute maximum and the least one will be the absolute minimum.
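
The simplified procedure (compare all the critical and end-point values) can be sketched as follows. The function $x^3 - 3x$ and the interval $-1.5 \le x \le 3$ are our own illustrative choices, not taken from the text; the critical points $x = \pm 1$ are found by hand from $3x^2 - 3 = 0$.

```python
def extreme_values(f, critical_points, a, b):
    """Compare f at the interior critical points and at the end-points of [a, b]."""
    candidates = [a, b] + [x for x in critical_points if a < x < b]
    values = {x: f(x) for x in candidates}
    x_max = max(values, key=values.get)
    x_min = min(values, key=values.get)
    return (x_max, values[x_max]), (x_min, values[x_min])

# Hypothetical example: f(x) = x**3 - 3x on [-1.5, 3]; f'(x) = 3x**2 - 3 vanishes at x = -1, 1.
f = lambda x: x**3 - 3*x
greatest, least = extreme_values(f, [-1, 1], -1.5, 3)
print(greatest, least)
```

In this example the greatest value is attained at the end-point x = 3, where the derivative does not vanish, while the least value is attained at the interior critical point x = 1.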

A continuous function $f(x)$ may have a derivative $f'$ with
discontinuities. In such a case the transition from the increase of $f$
to its decrease may take place not only at those points where $f' = 0$
but also at some points of discontinuity of the function $f'(x)$. In
order to determine the intervals of monotonicity of $f$ we should
investigate the sign of $f'$ and this can be carried out in the way the
sign of $f$ was determined in Sec. III.15. If the derivative has a
discontinuity at a point $x = a$ and changes its sign in the process of
moving through the point $x = a$ the function $f(x)$ has a cuspidal
extremum at $x = a$ (see the cuspidal minimum in Fig. 35 and the
cuspidal maxima in Fig. 106). A function $f$ no longer changes slowly
near its cuspidal extremum whereas it does change slowly near
extrema attained at its critical points (see Sec. 18) where $f' = 0$.

Thus, the comprehensive statement of the necessary condition for 
the existence of an extremum is the following: at a point of extremum, 
the derivative vanishes or has a discontinuity. 

The conditions for an extremum based on Taylor’s formula no 
longer hold for the points of discontinuity of a derivative. Only 
the condition based on the change of the sign of a derivative remains 
true in such a case. 

If a function $f(x)$ itself has discontinuities the points of
discontinuity may happen to be the end-points of some intervals of
monotonicity of the function even in those cases when the derivative has
the same sign on both sides from the point of discontinuity. For
instance, as it is seen in Fig. 124, $y' > 0$ everywhere for $x \neq a$ and

the necessary condition for an extremum we obtain

$$\frac{dV}{dx} = 2(a - 2x)(-2)x + (a - 2x)^2 \times 1 = (a - 2x)(a - 6x) = 0$$

which implies $x_1 = \frac{a}{2}$ and $x_2 = \frac{a}{6}$. The conditions of the problem
indicate that only $x = \frac{a}{6}$ will do, that is this value yields the
sought-for maximal volume which is equal to

$$V_{\max} = \left(a - 2\,\frac{a}{6}\right)^2 \frac{a}{6} = \frac{2}{27}\, a^3$$
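
The result can be checked by a direct search. The sketch below assumes that the quantity being maximized is $V(x) = (a - 2x)^2 x$ for $0 \le x \le a/2$, which is what the derivative written above corresponds to (this is our inference, as is the illustrative value a = 6); it compares the critical point a/6 with the best point of a fine grid.

```python
def volume(x, a):
    """V(x) = (a - 2x)**2 * x, the function whose derivative is written above."""
    return (a - 2*x)**2 * x

def dV_dx(x, a):
    """Derivative in the factored form (a - 2x)(a - 6x)."""
    return (a - 2*x) * (a - 6*x)

a = 6.0                                          # assumed illustrative value
x_best = a / 6                                   # critical point admitted by the problem
grid = [a / 2 * i / 1000 for i in range(1001)]   # points of 0 <= x <= a/2
x_grid = max(grid, key=lambda x: volume(x, a))   # brute-force maximizer

print(x_best, x_grid, volume(x_best, a), 2 * a**3 / 27)
```

The second critical point x = a/2 gives zero volume, which is why the conditions of the problem exclude it.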


Example 3. Let us consider the problem of refraction of the light
passing through the interface between two homogeneous (i.e. with
the same properties at all points) and isotropic (i.e. with the same
properties along all directions) media. Let us suppose first that
the interface is plane. Draw a plane through the ray of light (see
Fig. 128) and take points $A_1$ and $A_2$ on the ray. Now we can use
the so-called Fermat's principle in optics which states that the ray of
light propagating from $A_1$ to $A_2$ follows a trajectory such that it
takes the minimal interval of time for passing from $A_1$ to $A_2$ in
comparison with all the possible trajectories connecting $A_1$ and $A_2$.
According to the principle the point $M$ ($A_1$ and $A_2$ are regarded as
fixed) must be located in such a position that the time interval

$$t = t_1 + t_2 = \frac{l_1}{v_1} + \frac{l_2}{v_2} = \frac{\sqrt{h_1^2 + x^2}}{v_1} + \frac{\sqrt{h_2^2 + (a - x)^2}}{v_2}$$

should be as short as possible. Applying the necessary condition
for the existence of an extremum we obtain

$$\frac{dt}{dx} = \frac{x}{v_1 \sqrt{h_1^2 + x^2}} - \frac{a - x}{v_2 \sqrt{h_2^2 + (a - x)^2}} = 0$$

from which it follows that

$$\frac{x}{l_1 v_1} = \frac{a - x}{l_2 v_2}, \qquad \frac{\sin \alpha_1}{\sin \alpha_2} = \frac{x}{l_1} : \frac{a - x}{l_2} = \frac{v_1}{v_2}$$

[Fig. 127. Fig. 128: $v_1$ is the speed of light in the first medium; $v_2$ is
the speed of light in the second medium; $mm$ is the interface between
the media]

Thus, we have deduced the well-known law of refraction: the
sine of the angle of incidence bears the constant ratio to the sine
of the angle of refraction equal to the ratio of the speeds of light
in both media. If now the interface is not plane the law of refraction
will remain the same since the phenomenon of refraction depends
only on the properties of the media in an infinitesimal vicinity
of the point of refraction and the interface can be regarded as being
plane in such a vicinity.
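
The law of refraction can also be recovered without differentiation, by minimizing the travel time numerically. In the sketch below all numerical values are assumed for illustration, and the ternary search is our choice of method (it is applicable because the travel time is a convex function of x); the ratio of the sines is then compared with the ratio of the speeds.

```python
import math

def travel_time(x, h1, h2, a, v1, v2):
    """t(x) = sqrt(h1^2 + x^2)/v1 + sqrt(h2^2 + (a - x)^2)/v2."""
    return math.hypot(h1, x) / v1 + math.hypot(h2, a - x) / v2

def fastest_crossing(h1, h2, a, v1, v2, iterations=200):
    """Ternary search for the minimum of t on 0 <= x <= a."""
    lo, hi = 0.0, a
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if travel_time(m1, h1, h2, a, v1, v2) < travel_time(m2, h1, h2, a, v1, v2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

h1, h2, a, v1, v2 = 1.0, 1.0, 2.0, 1.0, 0.75   # assumed illustrative values
x = fastest_crossing(h1, h2, a, v1, v2)
sin1 = x / math.hypot(h1, x)                   # sine of the angle of incidence
sin2 = (a - x) / math.hypot(h2, a - x)         # sine of the angle of refraction
print(sin1 / sin2, v1 / v2)
```

The two printed numbers agree only to about seven or eight digits: near the minimum the travel time changes very slowly, so its values cannot locate the point x more precisely than that, in line with the remark made in Sec. 18.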

Thus we see that we have managed to deduce a physical law 
on the basis of solving an extremum problem according to a general 
physical principle formulated in terms of an extremum. Such a 
principle states that a certain physical quantity must have an extre- 
mal value in real circumstances. 

A more sophisticated investigation shows that in Fermat’s prin- 
ciple (and in some other principles of this kind) the essential con- 
dition is not that the time taken by the light to pass a distance 
must be minimal or even extremal but that the time must assume 
a stationary value. In the latter form Fermat’s principle can be 
deduced from the wave theory of light. 

§ 7. Constructing Graphs of Functions 

Differential calculus gives us a general method of investigation 
of individual peculiarities of the graph of a given function which 
enables us to construct the graph more accurately and considerably 
faster than by the primitive method of plotting separate points of 
the graph as it was done in Sec. I.14. The determination of intervals
of monotonicity of a function described in § 6 is an important example 
of the method of constructing graphs. Besides, there are some other 
techniques for investigating graphs which are also of use. They 
will be discussed here. 

20. Intervals of Convexity of a Graph and Points of Inflection.
Let a function $y = f(x)$ have a graph of the shape shown in Fig. 129.
We see that on the left of the point $A$ and on the right of the point
$B$ the graph is convex upwards and it is convex downwards between
$A$ and $B$ (see Sec. I.24). The points $A$ and $B$ at which the convexity
changes its character (i.e. the upward convexity passes into the
downward one or vice versa) are the points of inflection; the graph
intersects the tangents at these points though it forms a zero
angle with them.

In order to find the intervals of upward and downward convexity
note that the tangent to a graph turns clockwise as $x$ increases on
every interval where the graph is convex upwards (for example,
for $x < a$ in Fig. 129) and therefore the slope of the tangent decreases.
This slope being equal to $y'$, the graph is convex upwards or
downwards on the intervals of the $x$-axis where $y'$, respectively, decreases
or increases. These intervals can be found by investigating the sign
of $y''$ in the same way as the intervals of decrease and of increase
of $y$ were investigated by determining the sign of $y'$ in Sec. 17.



Therefore, the graph is convex upwards or downwards on those
intervals of the $x$-axis where $y'' < 0$ or $y'' > 0$, respectively. The points
of inflection correspond to values of $x$ such that in moving through $x$
the second derivative $y''$ changes its sign and $y''$ is equal to zero at the
points of inflection. It is supposed here that $y$, $y'$ and $y''$ have no
discontinuities. If there are such discontinuities some of the points of
discontinuity may be the end-points of certain intervals of convexity
(upward or downward) and therefore all the discontinuities must be
indicated on the $x$-axis while constructing the graph.

21. Asymptotes of a Graph. A graph of $y = f(x)$ may have
vertical (i.e. parallel to the $y$-axis) and inclined (i.e. not
parallel to the $y$-axis) asymptotes (see Fig. 130). There may be
any number of vertical asymptotes, even an infinite number (see,
for example, the tangent curve in Fig. 47), and they are determined
in the following way: if $|y| \to \infty$ as $x \to a$ ($a$ is finite) then the
straight line $x = a$ is a vertical asymptote.

There cannot be more than two inclined asymptotes corresponding
to $x \to \infty$ and $x \to -\infty$ and they are found as follows: let a straight
line $y = kx + b$ be an asymptote of the graph of $y = f(x)$ as
$x \to \infty$. Then (see Fig. 130) the difference $\delta = y_{as} - y_{gr}$ is equal to

$$\delta = (kx + b) - f(x) \tag{63}$$

and tends to zero as $x \to \infty$. Whence,

$$\frac{f(x)}{x} = k + \frac{b}{x} - \frac{\delta}{x} \to k \quad (\text{as } x \to \infty)$$

that is

$$k = \lim_{x \to \infty} \frac{f(x)}{x}$$

Besides, by (63),

$$f(x) - kx = b - \delta \to b \quad (\text{as } x \to \infty)$$

that is

$$b = \lim_{x \to \infty} [f(x) - kx]$$

Each of these limits must exist and be finite, otherwise there will
be no asymptote as $x \to \infty$. If these finite limits do exist then the
asymptote also exists since it is seen from the last equality that
the value $[f(x) - kx] - b$ tends to zero as $x \to \infty$, i.e.

$$\lim_{x \to \infty} [f(x) - (kx + b)] = 0$$

22. General Scheme for Investigating a Function and Constructing
Its Graph. This scheme for a function $y = f(x)$ consists of the
following rules.

(1) We find the domain of definition of the function, its points
of discontinuity and zeros and then determine its intervals of
positivity and negativity. After that we investigate the behaviour of
the function as its argument approaches the points of discontinuity
and the end-points of intervals of definition (including the behaviour
of the function at infinity). Further, we determine the asymptotes
of the graph. We also find out whether the function is even, odd or
periodic and so on.

(2) We determine the points of discontinuity and zeros of the
derivative and then find the intervals of increase and decrease of
the function, its points of extremum and the extremal values.
Further, we investigate the behaviour of the derivative in approaching
its points of discontinuity, the points of discontinuity of the function
itself (in case the function has finite jumps at these points) and the
end-points of the intervals on which the function is defined (if these
end-points are finite and the function has finite values at them).

(3) We next determine the points of discontinuity of the second
derivative and its zeros and then find the intervals on which the
function is convex upwards or downwards and also the points of
inflection. It is also useful to determine the direction of the tangent
at the points of inflection.





All the points thus found should be plotted in the coordinate
plane and then the graph itself is constructed. The shape of the
graph must reflect all the individual peculiarities of the behaviour
of the function. If the elements of the graph indicated above do not
describe the behaviour of the graph clearly enough it is desirable to
plot several additional points by calculating the values of $y$ for
some specifically chosen values of $x$. It is also expedient to determine
the direction of the tangent at those points after computing the
corresponding values of $y'$.

We shall represent, as an example, the investigation of the
behaviour of the graph of the function $y = \sqrt[3]{x^3 - 2x^2}$. In this case
the domain of the function is the whole $x$-axis $-\infty < x < \infty$;
there are no points of discontinuity. Putting $y = 0$ we see that
the function has two zeros: $x_1 = 0$ and $x_2 = 2$. Therefore there are
three intervals of retention of the sign: $-\infty < x < 0$, $0 < x < 2$
and $2 < x < \infty$. Substituting arbitrary values of the argument taken
from these intervals we see that the function is negative on the first
and the second intervals and positive on the third one. There are
no vertical asymptotes. Determining inclined asymptotes in
accordance with Sec. 21 we find (the reader should check it up!) that one
and the same straight line $y = x - \frac{2}{3}$ serves as an inclined
asymptote both for $x \to \infty$ and $x \to -\infty$. After computing the derivative

$$y' = \frac{3x^2 - 4x}{3 \sqrt[3]{(x^3 - 2x^2)^2}} = \frac{3x - 4}{3 \sqrt[3]{(x - 2)^2 x}}$$

we see that it has discontinuities (approaches infinity) as $x \to 0$,
$x \to 2$ and vanishes at $x = \frac{4}{3}$. Thus we have four intervals of
monotonicity: $-\infty < x < 0$, $0 < x < \frac{4}{3}$, $\frac{4}{3} < x < 2$ and $2 < x < \infty$.
Substituting arbitrary values from these intervals into $y'$ we find
that only the second one is an interval of decrease and all the other
intervals are intervals of increase. Therefore the third and the fourth
ones form an entire interval of increase. Hence, changes of the
character of monotonicity occur at $x = 0$ (where there is a
maximum with the maximal value $y = 0$) and at $x = \frac{4}{3}$ (where there
is a minimum with the minimal value $y = -\frac{2}{3} \sqrt[3]{4} = -1.058$).


Computing the second derivative we obtain, after some transformations,
the expression

$$y'' = -\frac{8}{9\sqrt[3]{(x - 2)^5 x^4}}$$

(verify this expression!). The only discontinuities of the second
derivative are the points x = 0 and x = 2 where the discontinuities
of the first derivative are placed, and there are no zeros of the second
derivative at all. Thus, there are three intervals inside which “the
character of convexity” is invariable: $-\infty < x < 0$, $0 < x < 2$
and $2 < x < \infty$. Now we substitute arbitrary values taken from
these intervals into the second derivative, and then the sign of the
derivative shows that the function is convex downwards on the
first and on the second intervals and is convex upwards on the third
one. Let us, in addition, calculate for x = -1 the values

$$y = -\sqrt[3]{3} = -1.44 \quad\text{and}\quad y' = \frac{7}{3\sqrt[3]{9}} = 1.12,$$

for x = 1,

$$y = -1 \quad\text{and}\quad y' = -\frac{1}{3} = -0.33,$$

and for x = 3,

$$y = \sqrt[3]{9} = 2.08 \quad\text{and}\quad y' = \frac{5}{3\sqrt[3]{3}} = 1.16$$

The constructed graph is shown in Fig. 131 where the points
which were calculated are depicted as circles (we leave it to the
reader to check that all the individual peculiarities of the graph
are indeed represented here).

The disposition of the graph relative to its asymptote can be
easily found on the basis of its asymptotic expansion, that is an
expansion which holds for sufficiently large values of |x|. This
expansion, in its turn, follows from formula (60):

$$y = \sqrt[3]{x^3 - 2x^2} = \sqrt[3]{x^3\left(1 - \frac{2}{x}\right)} = x\left(1 - \frac{2}{x}\right)^{1/3} = x - \frac{2}{3} - \frac{4}{9x} + \dots \tag{64}$$

where the dots denote infinitesimal terms of higher order relative to
$\frac{1}{x}$ as $|x| \to \infty$.

Hence, we have $y < x - \frac{2}{3} = y_{as}$ for large x > 0 and
$y > y_{as}$ for large x < 0 ($y_{as}$ denotes here the ordinate of the
asymptote). Besides, from equality (64) it straightway follows that

$$y - \left(x - \frac{2}{3}\right) \to 0 \quad \text{as } |x| \to \infty \tag{65}$$

Hence, if we had not known the equation of the asymptote
$y = x - \frac{2}{3}$ before we could deduce it from (65). Therefore, we
have established one more method of finding an inclined asymptote.



CHAPTER V 


Approximating Roots 
of Equations. 
Interpolation 


§ 1. Approximating Roots of Equations 

1. Introduction. We shall discuss here some methods of calculating
a numerical solution of an equation of the form

$$f(x) = 0 \tag{1}$$

where f is a given function. Such an equation may be algebraic in
case the function f is algebraic, or transcendental otherwise. We
shall call both types of equation (1) finite to distinguish them, for
example, from differential equations etc. Here we shall present
only some of the most important methods of solving equations of
form (1); for other methods the reader is referred to special courses
on approximate calculations.

The process of numerical solution of equation (1) usually begins
with finding a rough, approximate value of a root which is called
the zeroth approximation (the initial approximation). If a certain
physical problem is being considered such an initial approximation
may be known from the real physical conditions of the problem.
We can also begin with constructing an approximate sketch of the
graph of the function f. If doing this we find that the function is
continuous over a closed interval between a and b and assumes
values of opposite signs at the end-points a and b then we are sure,
by the properties of continuous functions (see Sec. III.14), that
there exists at least one zero of f on the interval, that is equation
(1) has at least one root there. Besides, there must be only one root
of f on the interval provided the function f is monotonic between a
and b. In the last case the root is separated from other ones. If we
denote the unknown root by $\alpha$ then $\alpha$ is sure to satisfy the
inequality $a < \alpha < b$. Different methods are used for further
specification of the value $\alpha$ (see Sec. 2).

It is sometimes more convenient to rewrite equation (1) in the
form $\varphi(x) = \psi(x)$ and then to find the intersection point of the
graphs of $y = \varphi(x)$ and $y = \psi(x)$. Doing this one tries to break
the left-hand side of equation (1) into two summands in such a way that
this should yield some well-known or, at any rate, simpler graphs.
An appropriate substitution for the variable x may also sometimes be
of use.

For instance, let us take the equation

$$\tan ax^2 - bx^2 = 0 \tag{2}$$

where a and b are given numbers. The change $ax^2 = s$ reduces (2)
to the equation

$$\tan s = ks \quad \left(k = \frac{b}{a}\right) \tag{3}$$

The graphs of the left-hand and right-hand sides of equation (3)
are shown in Fig. 132. It is clear that we are interested in the values
$s \ge 0$ only. We see that equation (3) has an infinitude of roots
$s_0 = 0 < s_1 < s_2 < \dots$ and therefore equation (2) also has
infinitely many roots. The dependence of the roots of (3) on k which can
be easily seen in Fig. 132 defines the dependence of the roots of the
original equation on the parameters a and b. We see that for k > 1
there appears a new root lying in the interval $0 < s < \frac{\pi}{2}$
(why is it so?).

We can easily derive an asymptotic expression for the solution of
equation (3) valid for large n. For definiteness, let k < 1. Then
Fig. 132 implies the desired expression
$s_n = n\pi + \frac{\pi}{2} - \alpha_n$ ($\alpha_n > 0$) where
$\alpha_n \to 0$ as $n \to \infty$; this can be rewritten (see the
notation of Sec. III.11) as $s_n = n\pi + \frac{\pi}{2} + o(1)$. If it is
necessary to specify this expansion we can substitute it into (3) which
results in

$$\tan\left(n\pi + \frac{\pi}{2} - \alpha_n\right) = k\left(n\pi + \frac{\pi}{2} - \alpha_n\right)$$

and after simple transformations we obtain

$$\cos\alpha_n = k\left(n\pi + \frac{\pi}{2} - \alpha_n\right)\sin\alpha_n$$

From this we deduce

$$\alpha_n = \frac{\alpha_n}{\sin\alpha_n}\cdot\frac{\cos\alpha_n}{k\pi n}\cdot\frac{1}{1 + \frac{1}{2n} - \frac{\alpha_n}{\pi n}}, \quad\text{i.e.}\quad \alpha_n = \frac{1}{k\pi n} + o\left(\frac{1}{n}\right) \tag{4}$$

If further specification is desired then we may, for example, denote
$\frac{1}{n} = t$, $\alpha_n = \alpha(t)$, where $t \to 0$, which yields

$$t\cos\alpha = k\left(\pi + \frac{\pi}{2}t - \alpha t\right)\sin\alpha \quad (\alpha = \alpha(t);\ \alpha(0) = 0)$$

Now it is possible to put down several terms of the expansion of
$\alpha(t)$ into Maclaurin’s series [of form (IV.54) but in powers of t].
The calculations which we leave to the reader give

$$\alpha = \frac{1}{k\pi}t - \frac{1}{2k\pi}t^2 + \frac{1}{k\pi}\left(\frac{1}{4} + \frac{1}{k\pi^2} - \frac{1}{3k^2\pi^2}\right)t^3 + \dots, \quad\text{i.e.}\quad \alpha_n = \frac{1}{k\pi n} - \frac{1}{2k\pi n^2} + O\left(\frac{1}{n^3}\right)$$
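The two-term expansion of $\alpha_n$ can be checked numerically. The following is only an illustrative sketch: the values k = 0.5 and n = 10 are assumptions chosen for the check and do not come from the text, and the function names are hypothetical. It solves the relation $\cos\alpha_n = k(n\pi + \frac{\pi}{2} - \alpha_n)\sin\alpha_n$ by repeatedly halving the interval $(0, \frac{\pi}{2})$ and compares the result with $\frac{1}{k\pi n} - \frac{1}{2k\pi n^2}$.

```python
import math

def alpha_exact(k, n, iterations=200):
    """Solve cos(a) = k*(n*pi + pi/2 - a)*sin(a) for a in (0, pi/2) by halving."""
    def g(a):
        return math.cos(a) - k * (n * math.pi + math.pi / 2 - a) * math.sin(a)
    lo, hi = 0.0, math.pi / 2          # g(lo) = 1 > 0 and g(hi) < 0 for n >= 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def alpha_series(k, n):
    """First two terms of the expansion of alpha_n in powers of 1/n."""
    return 1 / (k * math.pi * n) - 1 / (2 * k * math.pi * n * n)

k, n = 0.5, 10                          # illustrative values (assumed)
print(alpha_exact(k, n), alpha_series(k, n))
```

The two printed numbers should agree up to a quantity of the order of $\frac{1}{n^3}$, in accordance with the expansion.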


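Now on the basis of formula (IV.60) we obtain an asymptotic
expression for the positive roots of equation (2) for large n:

$$x_n = \sqrt{\frac{s_n}{a}} = \sqrt{\frac{1}{a}\left(n\pi + \frac{\pi}{2} - \frac{1}{k\pi n} + \frac{1}{2k\pi n^2} + \dots\right)} = \left(\frac{n\pi}{a}\right)^{1/2}\left(1 + \frac{1}{2n} - \frac{1}{k\pi^2 n^2} + \dots\right)^{1/2}$$

$$= \sqrt{\frac{n\pi}{a}}\left[1 + \frac{1}{4n} - \left(\frac{1}{2k\pi^2} + \frac{1}{32}\right)\frac{1}{n^2} + \dots\right]$$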


2. Cut-and-Try Method. Method of Chords. Method of Tangents.
We begin with the cut-and-try method. Its scheme is the following.
Let, for definiteness, f(a) < 0 and f(b) > 0. We first take an arbitrary
value c between a and b and compute f(c). It should be noted
that it is the sign of f(c) that is important here and not the value
f(c) itself. Now let us suppose that we obtain f(c) > 0. This means
that we have “a shot over the target”, “a plus round”, and therefore
$a < \alpha < c$. Further, we take some value d between a and c and
compute f(d); if f(d) < 0 then we have “a minus round”, i.e.
$d < \alpha < c$ and so on. The values c, d, ... may be taken more or
less arbitrarily. It is better, of course, to choose them in such a way
that the calculations should be simpler. At the same time it should
be noted that if, for example, |f(a)| is much smaller than |f(b)|,
it is quite likely that $\alpha$ is closer to a than to b and therefore it is
better to take c closer to a and so on.
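The scheme just described can be sketched in a few lines of code. This is only a minimal illustration under stated assumptions: the trial point is always taken at the midpoint of the current interval (the text leaves the choice free), the function name cut_and_try is hypothetical, and the test equation $x^2 - 2 = 0$ is chosen purely for the demonstration.

```python
def cut_and_try(f, a, b, tol=1e-6):
    """Shrink [a, b] with f(a) < 0 < f(b) until its length does not exceed tol."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("expected f(a) < 0 < f(b)")
    while b - a > tol:
        c = 0.5 * (a + b)      # trial value: the midpoint (an assumption)
        if f(c) > 0:           # "plus round": the root lies between a and c
            b = c
        else:                  # "minus round" (or an exact hit): between c and b
            a = c
    return a, b

# Illustrative equation x^2 - 2 = 0 on the interval (1, 2)
lo, hi = cut_and_try(lambda x: x * x - 2.0, 1.0, 2.0)
print(lo, hi)
```

Only the sign of f(c) is used at each step, exactly as noted above; the midpoint rule simply guarantees that the interval is halved every time.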

The method of chords prescribes to take for the point c not an
arbitrary point but the point of intersection of the chord drawn
through the points M [a, f(a)] and N [b, f(b)] with the x-axis
(see Fig. 133). In other words, we act as if we approximately replaced
the arc of the graph by a line segment. This means that we are carrying
out the linear interpolation which looks sufficiently justified
provided the interval (a, b) is not too large. In order to find the
point c let us write the equation of the chord MN [see equation
(II.22)]:

$$\frac{y - f(a)}{f(b) - f(a)} = \frac{x - a}{b - a}$$

Now putting y = 0 we obtain the corresponding value x = c:

$$c = a - \frac{f(a)(b - a)}{f(b) - f(a)} = b - \frac{f(b)(b - a)}{f(b) - f(a)} \tag{5}$$

The procedure can be repeated several times if it is necessary (see
Fig. 133).
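Formula (5) and its repetition may be sketched as follows. This is an illustration rather than a prescribed algorithm: the rule for choosing which end-point to replace (keep the pair of points with opposite signs of f), the stopping test and the names chord_point and method_of_chords are assumptions. The test function is the left-hand side of the equation $x^3 + x^2 - 3 = 0$ considered later in this section.

```python
def chord_point(a, fa, b, fb):
    """Formula (5): intersection of the chord through (a, fa), (b, fb) with the x-axis."""
    return a - fa * (b - a) / (fb - fa)

def method_of_chords(f, a, b, tol=1e-10, max_iter=100):
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        c_new = chord_point(a, fa, b, fb)
        fc = f(c_new)
        if fa * fc < 0:            # root between a and c_new
            b, fb = c_new, fc
        else:                      # root between c_new and b
            a, fa = c_new, fc
        if abs(c_new - c) < tol:   # successive points practically coincide
            return c_new
        c = c_new
    return c

f8 = lambda x: x**3 + x**2 - 3
first = chord_point(1.1, f8(1.1), 1.2, f8(1.2))
root = method_of_chords(f8, 1.1, 1.2)
print(first, root)
```

The value first is the point obtained after a single application of formula (5) on the interval (1.1, 1.2).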

In the method of tangents (also called Newton’s method) we choose
the point of intersection of the tangent line drawn to the graph
through one of the end-points of the considered arc with the x-axis
as the point c. The equation of the tangent shown in Fig. 134 has
the form [see equation (IV.5)]

$$y - f(b) = f'(b)(x - b)$$

From this, putting y = 0, we derive

$$c = b - \frac{f(b)}{f'(b)} \tag{6}$$

The procedure described here may also be repeated several times
(see Fig. 134).

Newton’s method may be interpreted irrespective of its geometric
meaning. Let us denote the zeroth approximation as $x_0$ and expand
the left-hand side of (1) in powers of $x - x_0$ on the basis of Taylor’s
formula (IV.53); this yields the equation

$$f(x_0) + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \dots = 0$$

If now we carry out the linearization, that is if we drop all the terms
that are infinitesimals of orders higher than the first, we shall get
the linearized equation (1):

$$f(x_0) + f'(x_0)(x - x_0) = 0$$

The solution of this equation is

$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$

It can be taken as the first approximation of the root of equation (1).
Thus we arrive at the same formula (6). The second approximation
can be obtained from the first approximation by using the formula

$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} \tag{7}$$

and so forth. As a rule, Newton’s method leads to the aim provided
the zeroth approximation does not lie too far from the desired root.
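The repeated use of formulas (6) and (7) can be sketched as below. The stopping rule (two successive approximations closer than tol), the iteration limit and the function name newton are assumptions made for the illustration; the example equation $x^3 + x^2 - 3 = 0$ and the starting value 1.2 are those of the worked example given later in this section.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Method of tangents: x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter steps")

f = lambda x: x**3 + x**2 - 3
df = lambda x: 3 * x**2 + 2 * x

first_step = 1.2 - f(1.2) / df(1.2)   # one application of formula (6) with b = 1.2
root = newton(f, df, 1.2)
print(first_step, root)
```

The error raised after max_iter steps reflects the caveat in the text: a zeroth approximation lying too far from the root need not lead to convergence.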

The following modification of Newton’s method is sometimes
used: the denominators in formula (7) and in the formulas for the
further approximations are replaced by $f'(x_0)$. This means,
geometrically, that all the inclined lines in Fig. 134 are drawn so that
they are parallel to the tangent at the original point N. The
convergence of the modified method is a little worse compared with
the original scheme but the calculation of each approximation is,
naturally, simplified.

The combined method is based on the following consideration.
If the segment of the graph in question has no points of inflection
and is not broken the method of chords and the method of tangents
give the points lying on different sides of the desired root. If, for
example, a graph is situated in the way shown in Fig. 135 then,
beginning with the interval (a, b), we can determine the point $a_1$
by the method of chords and the point $b_1$ by the method of tangents.
This will result in a new interval $(a_1, b_1)$ containing the desired
root $\alpha$. Repeating the analogous procedure for the interval
$(a_1, b_1)$ we again obtain a new interval $(a_2, b_2)$ containing the
desired root etc. Thus, we obtain in succession the two-sided
approximations. The approximation process is stopped when the required
degree of accuracy is attained.

Let us take, for example, the equation

$$x^3 + x^2 - 3 = 0 \tag{8}$$

We shall consider the coefficients of equation (8) to be quite exact.
The investigation of the derivatives indicates that the left-hand
side of (8) which we denote by f(x) increases from $-\infty$ to
$-2\frac{23}{27}$ for x increasing in the interval
$-\infty < x < -\frac{2}{3}$, decreases to $-3$ for
$-\frac{2}{3} < x < 0$ and then again increases to $\infty$ for
$0 < x < \infty$. Besides, f(x) has only one point of inflection at
$x = -\frac{1}{3}$ (check it up!). Consequently, the equation has a unique
real and positive root $\alpha$. Since f(0) = -3, f(1) = -1 and f(2) = 9
(see Fig. 136) we have $1 < \alpha < 2$. Calculating in accordance with
the cut-and-try method we obtain f(1.1) = -0.459 and f(1.2) = 0.168,
i.e. $1.1 < \alpha < 1.2$ (a crude estimation of a root is usually
obtained by the cut-and-try method). Now putting a = 1.1 and b = 1.2
we apply formulas (5) and (6) (i.e. we use the combined method):

$$a_1 = 1.2 - \frac{0.168 \times 0.1}{0.168 + 0.459} = 1.173, \qquad b_1 = 1.2 - \frac{0.168}{6.72} = 1.175$$

Hence, we can put $\alpha = 1.174$, this value being accurate to 0.001.
If the accuracy is insufficient we can proceed to further calculations:
f(1.174) = -0.003628 (i.e. we have a “minus round”, the calculations
being accurate to $10^{-6}$) and f(1.175) = 0.002859. Taking
now a = 1.174, b = 1.175 and calculating by the combined method
we obtain the values $a_2 = 1.1745593$ and $b_2 = 1.1745596$ accurate
to $10^{-7}$. Hence, with an accuracy of 0.000001 we can assume
$\alpha = 1.174559$. Note how fast the degree of accuracy increases!
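The calculation above can be reproduced by a short program. The sketch below is an illustration under assumptions: the tangent is always drawn at the right end-point b (which suits equation (8), where f and f'' are both positive at b), the intermediate values are not rounded as in the hand calculation, and the names combined_step and history are hypothetical.

```python
def f(x):
    return x**3 + x**2 - 3

def df(x):
    return 3 * x**2 + 2 * x

def combined_step(a, b):
    """Chord (formula (5)) gives the new left end, tangent at b (formula (6)) the new right end."""
    a_new = b - f(b) * (b - a) / (f(b) - f(a))
    b_new = b - f(b) / df(b)
    return a_new, b_new

a, b = 1.1, 1.2
history = []
for _ in range(20):                 # safety limit on the number of steps
    a, b = combined_step(a, b)
    history.append((a, b))
    if b - a <= 1e-9:
        break

print(history[0])                   # compare with a_1 = 1.173, b_1 = 1.175
print(a, b)
```

Since the program does not restart from rounded end-points, its second pair of values differs slightly from $a_2$, $b_2$ of the text, but both computations enclose the same root.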

3. Iterative Method. The methods described in Sec. 2 belong to 
the class of iterative methods (or, in other words, to the class of 
methods of successive approximations). A characteristic feature 
of all these methods is the successive iteration of one and the same 
scheme during the calculation process. This uniformity, i.e. the 
repetition of one and the same process, has many advantages. In 
particular, it is very convenient when we use digital computers. 

The general form of the iterative method applicable to equation
(1) is the following: the equation is rewritten in the equivalent form

$$x = \varphi(x) \tag{9}$$

Then we choose a certain value $x = x_0$ as the zeroth approximation.
It is desirable, of course, that $x_0$ should be as close as possible to
the sought-for root if we have some information about it. The
subsequent approximations are computed by the formulas
$x_1 = \varphi(x_0)$, $x_2 = \varphi(x_1)$, ..., or, generally,

$$x_{n+1} = \varphi(x_n) \tag{10}$$

There can be two cases here.

(1) The process may converge, that is the successive approximations
$x_n$ tend to a limit $\bar{x}$ as $n \to \infty$. In this case we can
pass to the limit in formula (10) (the function $\varphi$ being
continuous) which yields $\bar{x} = \varphi(\bar{x})$. Thus, we see that
$x = \bar{x}$ is a root of equation (9).

(2) The process may diverge, that is there can be no finite limit
for the “approximations” thus constructed. But this fact does not
necessarily imply that there is no solution of equation (9) because
it might simply occur that the iterative process was constructed
in an inappropriate way. By the way, it may happen that even in
the case of convergence we obtain some other solution (which may
have no physical meaning) quite different from the desired root
in whose vicinity $x_0$ has been chosen.

We shall demonstrate these possibilities by taking an example
of a very simple equation which can be easily solved:

$$x = \frac{x}{2} + 1 \tag{11}$$

The equation has an obvious solution x = 2. If we put $x_0 = 0$ and
calculate with an accuracy of 0.001 we shall obtain $x_1 = 1.000$,
$x_2 = 1.500$, $x_3 = 1.750$, $x_4 = 1.875$, $x_5 = 1.938$,
$x_6 = 1.969$, $x_7 = 1.984$, $x_8 = 1.992$, $x_9 = 1.996$,
$x_{10} = 1.998$, $x_{11} = 1.999$, $x_{12} = 2.000$ and
$x_{13} = 2.000$, that is the process has “practically converged”.

If we take the equation

$$x = \frac{x}{10} + 1$$

instead of (11) and assume $x_0 = 0$ then we receive the values
$x_1 = 1.000$, $x_2 = 1.100$, $x_3 = 1.110$, $x_4 = 1.111$ and
$x_5 = 1.111$ accurate to 0.001; thus, the process practically converged
after four iterations.

If we solve equation (11) for x entering into the right-hand side,
that is if we rewrite (11) in the equivalent form

$$x = 2x - 2 \tag{12}$$

and begin with $x_0 = 0$, we shall get the sequence $x_1 = -2$,
$x_2 = -6$, $x_3 = -14$ etc., that is the process will not converge. We
could have forecast this result if we had observed that formula (10)
implied the equality

$$x_{n+1} - x_n = \varphi(x_n) - \varphi(x_{n-1}) \tag{13}$$

that is $x_2 - x_1 = \varphi(x_1) - \varphi(x_0)$,
$x_3 - x_2 = \varphi(x_2) - \varphi(x_1)$ and so forth. In case a function
changes more slowly than its argument or, more precisely, if

$$|\varphi(x) - \varphi(\tilde{x})| \le k\,|x - \tilde{x}| \quad (k = \text{const} < 1) \tag{14}$$

the distance between the successive approximations rapidly approaches
zero and the iterative process converges. The smaller k, the
greater the speed of convergence. Inequality (14) must be fulfilled
for all $x$ and $\tilde{x}$ or, at any rate, near the sought-for root
$\bar{x}$ of equation (9). We shall show in Sec. 4 that inequality (14)
holds provided $|\varphi'(x)| \le k$.

We see that equations (11) and (12) are equivalent but generate
different iterative processes. In other cases equation (1) may be
rewritten in form (9) in many different ways, each of these ways
generating its own iterative process. Some of these processes may
happen to converge fast and therefore are more convenient, some
of them may converge slowly and, finally, some of them may simply
diverge. In particular, it is easy to verify that if we rewrite equation
(1) in the form

$$x = x - \frac{f(x)(b - x)}{f(b) - f(x)}$$

and begin with $x_0 = a$ [see formula (5)] this will yield the method
of chords. Similarly, if equation (1) is rewritten in the form

$$x = x - \frac{f(x)}{f'(x)} \tag{15}$$

then we arrive at the method of tangents.
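The different behaviour of the processes generated by equations (11) and (12) is easy to observe on a computer. The sketch below is an illustration only; the helper name iterate is hypothetical, and the values are not rounded to 0.001 at each step as in the hand calculation, so the later approximations may differ from the quoted ones in the last digit.

```python
def iterate(phi, x0, n):
    """Return [x0, x1, ..., xn] where x_{k+1} = phi(x_k), as in formula (10)."""
    xs = [x0]
    for _ in range(n):
        xs.append(phi(xs[-1]))
    return xs

conv = iterate(lambda x: x / 2 + 1, 0.0, 13)   # equation (11): |phi'| = 1/2 < 1
div = iterate(lambda x: 2 * x - 2, 0.0, 3)     # equation (12): |phi'| = 2 > 1

print([round(x, 3) for x in conv])
print(div)
```

The first sequence approaches the root x = 2, while the second one moves away from it, in agreement with condition (14).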

There exists a comprehensive theoretical investigation of the
problem of convergence of the iterative method. But in more
complicated problems than the above ones it is often easier to compute
several approximations. Then judging by the results we can draw
the necessary conclusion as to the convergence of the process without
giving any theoretical proof. If we see that an approximation differs
from the preceding one by a small quantity (for instance, if their
difference is less than the required degree of accuracy) we have every
reason to stop the iterative process.

At any rate, such a situation shows that the approximation satisfies
equation (9) with a good accuracy because $|x_n - x_{n+1}| < h$ implies
$|x_n - \varphi(x_n)| < h$.

4. Formula of Finite Increments.

Inequality (14) can be verified by means of the so-called formula of
finite increments we are going to deduce here. Let us suppose that a
function $y = \varphi(x)$ is continuous over an interval
$a \le x \le b$. Consider the graph of the function on the interval and
draw the chord MN connecting the end-points of the arc (see Fig. 137).
Let the derivative of the function also be continuous. Besides, we
suppose, for definiteness, that there is a portion of the graph lying
above the chord.

Now we draw a straight line ll parallel to MN and lying above
the graph. Imagine that we begin to lower the line so that it should
remain parallel to MN. Then there must exist a moment when the
straight line touches the graph at a point P. Thus, there is at least
one point lying on a smooth arc such that the tangent to the arc at this
point should be parallel to the chord connecting the end-points of the
arc. If now we equate the slopes of the chord and of the tangent we
receive the formula

$$\frac{\varphi(b) - \varphi(a)}{b - a} = \varphi'(c), \quad\text{i.e.}\quad \varphi(b) - \varphi(a) = \varphi'(c)(b - a) \tag{16}$$

where c is a point lying between a and b. Formula (16) is called the
formula of finite increments (since the distance from a to b may not
be small) or Lagrange’s theorem. We must note that the value c
entering into formula (16) is not at all an arbitrary number for
the given function and the given interval (a, b) although there may
be several values that may serve as c. For instance, looking at
Fig. 137 we see that we can as well take c* in place of c in formula
(16) because the tangent at the point P* is also parallel to the chord
MN. The exact value of c is usually unknown when we apply formula
(16) but it is often enough to know that c is placed somewhere
between a and b.

For example, suppose that $|\varphi'(x)| \le k$ on an interval. Then
applying formula (16) to two arbitrary points $x$ and $\tilde{x}$ taken
from the interval we see that
$|\varphi(x) - \varphi(\tilde{x})| \le k\,|x - \tilde{x}|$ for them
[see formula (14)].

It also follows from formulas (13) and (16) that if the successive
approximations are placed not far from the exact solution $\bar{x}$, and
therefore $\varphi'(x)$ changes very little, the speed of convergence of
the iterative process is approximately that of a geometrical progression
with the ratio $\varphi'(\bar{x})$. If the differences between successive
approximations formed an exact geometrical progression [as in example
(11)] its first term would be $a = x_1 - x_0$ and the ratio would equal
$q = \frac{x_2 - x_1}{x_1 - x_0}$. Therefore the sum of such a
progression, that is the difference $\bar{x} - x_0$, would be equal to

$$\frac{a}{1 - q} = \frac{x_1 - x_0}{1 - \dfrac{x_2 - x_1}{x_1 - x_0}} = \frac{(x_1 - x_0)^2}{2x_1 - x_0 - x_2}$$

and therefore

$$\bar{x} = x_0 + \frac{(x_1 - x_0)^2}{2x_1 - x_0 - x_2} = \frac{x_1^2 - x_0 x_2}{2x_1 - x_0 - x_2} \tag{17}$$

In more complicated cases the successive differences of approximations
only resemble the terms of a geometrical progression. Then
formula (17) does not yield an exact solution but makes it possible
to omit several approximations and to get an approximate value
of a root which may again initiate a new iterative process.
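Formula (17) can be tried out directly. In the sketch below the function name extrapolate is hypothetical, and the second example, the equation x = cos x with $x_0 = 1$, is an assumption introduced only to show a case where the differences are not exactly geometric; its root is found here by simply continuing the iterations.

```python
import math

def extrapolate(x0, x1, x2):
    """Formula (17): the limit the process would have if the successive
    differences formed an exact geometrical progression."""
    return (x1 * x1 - x0 * x2) / (2 * x1 - x0 - x2)

# Example (11): x = x/2 + 1, x0 = 0, x1 = 1, x2 = 1.5; the progression is exact.
exact_case = extrapolate(0.0, 1.0, 1.5)

# Illustrative equation x = cos x (not from the text): only approximately geometric.
x0 = 1.0
x1 = math.cos(x0)
x2 = math.cos(x1)
estimate = extrapolate(x0, x1, x2)

root = x2
for _ in range(200):               # plain iterations, for comparison
    root = math.cos(root)

print(exact_case, estimate, x2, root)
```

In the first case the formula gives the root at once; in the second the extrapolated value is noticeably closer to the root than $x_2$ is and may serve as the start of a new iterative process.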

Newton’s iterative process is of special importance. Indeed, the
derivative of the right-hand side of (15) is equal to

$$1 - \frac{f'f' - ff''}{f'^2} = \frac{ff''}{f'^2}$$

and vanishes at $x = \bar{x}$ since $f(\bar{x}) = 0$ (we suppose that
$f'(\bar{x}) \ne 0$). Hence, by the preceding considerations, Newton’s
iterative method converges faster than any geometrical progression with
an arbitrary ratio. The rate of this convergence may be easily
illustrated by the following typical example. Let us consider the
approximations obtained by Newton’s method for the root $\bar{x} = 0$ of
the equation $x + x^2 = 0$. These approximations are connected with each
other by the relation

$$x_{n+1} = x_n - \frac{x_n + x_n^2}{1 + 2x_n} = \frac{x_n^2}{1 + 2x_n} \approx x_n^2$$

To estimate the rate of convergence let us replace this approximate
equality by the exact one. Then we get in succession $x_1 = x_0^2$,
$x_2 = x_1^2 = x_0^4$, $x_3 = x_2^2 = x_0^8$ etc. Generally,
$x_n = x_0^{2^n}$. The right-hand side $x_0^{2^n}$ tends to zero as
$n \to \infty$, for $|x_0| < 1$, faster than any exponential expression.
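The rapid decrease of the approximations in this example can be watched numerically. The following sketch is an illustration; the starting value $x_0 = 0.1$ is an assumption, and the iteration is written in the simplified form $x_{n+1} = x_n^2 / (1 + 2x_n)$ obtained above.

```python
x = 0.1                      # zeroth approximation (an illustrative choice)
errors = [x]                 # distance from the root 0 at each step
for _ in range(5):
    x = x * x / (1 + 2 * x)  # Newton's step for x + x^2 = 0
    errors.append(x)

for n, e in enumerate(errors):
    print(n, e)
```

Each printed value does not exceed the square of the preceding one, so the number of correct decimal places roughly doubles at every step.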

5. Small Parameter Method. The small parameter method (the
perturbation method), as well as the iterative method, is one of the
most universal methods in mathematics. Here is the general idea
of the method. Let us consider a problem involving some unknown
quantities and, in addition, a parameter $\alpha$. Suppose that it is not
too difficult to solve the problem for a certain value
$\alpha = \alpha_0$ (this is the so-called unperturbed solution). Then the
solution for $\alpha$ lying close to $\alpha_0$ (the so-called perturbed
solution) may be in many cases obtained by expanding the solution in
powers of $\alpha - \alpha_0$, with a certain degree of accuracy, by
means of formulas similar to formulas (IV.49), (IV.50), (IV.51) etc.
Obviously, the first term of such an expansion does not contain
$\alpha - \alpha_0$ and is obtained for $\alpha = \alpha_0$, i.e. it
coincides with the unperturbed solution. The subsequent terms yield
corrections to the unperturbed solution; these terms are infinitesimals
of the first, second etc. orders (relative to $\alpha - \alpha_0$). These
terms are usually computed by the method of undetermined coefficients,
i.e. the coefficients in $(\alpha - \alpha_0)$, $(\alpha - \alpha_0)^2$
etc. are denoted by letters and the values denoted by the letters
are then found on the basis of the conditions of the problem. This
method gives a good result only for values of $\alpha$ which are close to
$\alpha_0$. The smaller $|\alpha - \alpha_0|$, the smaller the number of
terms that should be computed to attain a desired accuracy. It is often
convenient to choose a parameter so that $\alpha_0 = 0$; then the
difference $\alpha - \alpha_0 = \alpha$ is considered small and this
accounts for the term “the small parameter method”. The number of terms
that must be taken may be determined by a method similar to the one used
in the end of Sec. III.6. It should also be noted that an attempt to use
the small parameter method for large $|\alpha - \alpha_0|$ may lead to
fundamental mistakes because the dropped terms may be more significant
than the retained ones in this case.

Thus, the small parameter method makes it possible to obtain
a solution of a problem that is formulated in terms which are close
to the terms of a certain “main” problem provided, of course, this
change of the formulation does not yield a fundamental, qualitative
change of the solution. Even determining the first term containing
a parameter often enables us to make some useful conclusions
concerning the dependence of the solution on the parameter.

Example. Let us solve the equation

$$x^3 - \alpha x^2 + 1 = 0 \tag{18}$$

for small $|\alpha|$ with an accuracy up to the term $\alpha^3$
inclusively. In order to do this note that the value $\alpha = 0$ yields
the equation $x^3 + 1 = 0$ which has an obvious root $x_0 = -1$.
Therefore we put down the expression

$$x_\alpha = -1 + a\alpha + b\alpha^2 + c\alpha^3 + \text{infinitesimals of higher order}$$

Substituting this expression into (18) and taking terms only up
to the order of $\alpha^3$ we receive (check it!)

$$(-1 + 3a\alpha + 3b\alpha^2 - 3a^2\alpha^2 - 6ab\alpha^3 + 3c\alpha^3 + a^3\alpha^3) - \alpha(1 - 2a\alpha - 2b\alpha^2 + a^2\alpha^2) + 1 + \text{infinitesimals of higher order} = 0$$

From this, equating coefficients in the same powers of $\alpha$, we derive
$3a - 1 = 0$, $3b - 3a^2 + 2a = 0$ and
$-6ab + 3c + a^3 + 2b - a^2 = 0$. Now we find, in succession,
$a = \frac{1}{3}$, $b = -\frac{1}{9}$ and $c = \frac{2}{81}$. Hence, we
obtain the expression

$$x_\alpha = -1 + \frac{\alpha}{3} - \frac{\alpha^2}{9} + \frac{2\alpha^3}{81} \tag{19}$$

for the root of equation (18) which is accurate to infinitesimals of
higher order relative to $\alpha^3$ (namely, the error is of the order of
$\alpha^4$).
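Expansion (19) may be compared with a direct numerical solution of equation (18). The sketch below is an illustration only: the value $\alpha = 0.1$ is an assumption, the function names are hypothetical, and the "exact" root is found by Newton's method of Sec. 2 started from the unperturbed solution x = -1.

```python
def series_root(alpha):
    """Expansion (19), terms up to alpha^3."""
    return -1 + alpha / 3 - alpha**2 / 9 + 2 * alpha**3 / 81

def newton_root(alpha, x=-1.0, steps=50):
    """Root of x^3 - alpha*x^2 + 1 = 0 near x = -1 by Newton's method."""
    for _ in range(steps):
        x -= (x**3 - alpha * x**2 + 1) / (3 * x**2 - 2 * alpha * x)
    return x

alpha = 0.1                      # illustrative small value of the parameter
approx = series_root(alpha)
exact = newton_root(alpha)
print(approx, exact, abs(approx - exact))
```

For such a small $\alpha$ the discrepancy is well below $\alpha^4$, in agreement with the estimate of the error given above.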

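Just the same result may be obtained by applying directly Taylor’s
formula (IV.51) in which we change the notation a little:

$$x = x_0 + \left(\frac{dx}{d\alpha}\right)_0 \alpha + \frac{1}{2!}\left(\frac{d^2x}{d\alpha^2}\right)_0 \alpha^2 + \frac{1}{3!}\left(\frac{d^3x}{d\alpha^3}\right)_0 \alpha^3 + \dots \tag{20}$$

Here the subscript “zero” points out that the value $\alpha = 0$ is
substituted into the corresponding terms. Now let us differentiate
equality (18) with respect to $\alpha$ in a manner similar to the one used
in Sec. IV.11:

$$3x^2\frac{dx}{d\alpha} - x^2 - 2\alpha x\frac{dx}{d\alpha} = 0,$$

$$6x\left(\frac{dx}{d\alpha}\right)^2 + 3x^2\frac{d^2x}{d\alpha^2} - 4x\frac{dx}{d\alpha} - 2\alpha\left(\frac{dx}{d\alpha}\right)^2 - 2\alpha x\frac{d^2x}{d\alpha^2} = 0,$$

$$6\left(\frac{dx}{d\alpha}\right)^3 + 18x\frac{dx}{d\alpha}\frac{d^2x}{d\alpha^2} + 3x^2\frac{d^3x}{d\alpha^3} - 6\left(\frac{dx}{d\alpha}\right)^2 - 6x\frac{d^2x}{d\alpha^2} - 6\alpha\frac{dx}{d\alpha}\frac{d^2x}{d\alpha^2} - 2\alpha x\frac{d^3x}{d\alpha^3} = 0$$

Substituting $\alpha = 0$ and $x = -1$ we derive

$$3\left(\frac{dx}{d\alpha}\right)_0 - 1 = 0, \qquad -6\left(\frac{dx}{d\alpha}\right)_0^2 + 3\left(\frac{d^2x}{d\alpha^2}\right)_0 + 4\left(\frac{dx}{d\alpha}\right)_0 = 0,$$

$$6\left(\frac{dx}{d\alpha}\right)_0^3 - 18\left(\frac{dx}{d\alpha}\right)_0\left(\frac{d^2x}{d\alpha^2}\right)_0 + 3\left(\frac{d^3x}{d\alpha^3}\right)_0 - 6\left(\frac{dx}{d\alpha}\right)_0^2 + 6\left(\frac{d^2x}{d\alpha^2}\right)_0 = 0$$

from which we obtain, in succession,

$$\left(\frac{dx}{d\alpha}\right)_0 = \frac{1}{3}, \qquad \left(\frac{d^2x}{d\alpha^2}\right)_0 = -\frac{2}{9}, \qquad \left(\frac{d^3x}{d\alpha^3}\right)_0 = \frac{4}{27}$$

From this it is seen that formula (20) implies expansion (19) which
is sufficiently accurate for small $|\alpha|$.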

The small parameter method is closely and directly related to
the iterative method of Sec. 3. We shall illustrate this connection
by taking the same example (18). It is always convenient to have
an unperturbed solution equal to zero. To attain this let us make
the substitution $x = -1 + y$ which yields

$$-1 + 3y - 3y^2 + y^3 - \alpha + 2\alpha y - \alpha y^2 + 1 = 0$$

i.e.

$$y = \frac{1}{3}\alpha - \frac{2}{3}\alpha y + y^2 + \frac{1}{3}\alpha y^2 - \frac{1}{3}y^3$$

If now we carry out iterations beginning with the value $y_0 = 0$ and
dropping the infinitesimal terms of order higher than the third we
get the desired result after three iterations. It is also easy to verify
that it is permissible to neglect those terms of each approximation
which are infinitesimals of an order higher than the number of the
approximation.

§ 2. Interpolation 

6. Lagrange’s Interpolation Formula. As it was shown in Sec. I.22,
the process of linear interpolation consists in an approximate
replacement of a given function $y = f(x)$ by a linear function
$y = ax + b$ coinciding with f(x) at two points. Obviously, the accuracy
of such an approximation may be increased by taking a polynomial
of degree n of the form

$$P(x) = P_n(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_{n-1}x + a_n$$

in place of a linear function.

The polynomial $P_n(x)$ approximating the function f(x) contains
n + 1 parameters (i.e. its coefficients), and thus n + 1 conditions,
in general, are needed to determine such a polynomial. Let us take,
for the sake of simplicity, a polynomial of the second degree:

$$P(x) = P_2(x) = ax^2 + bx + c$$

(the general case is investigated quite similarly). To choose such
a polynomial we must set three conditions. These conditions are
often taken in the following form: the polynomial should coincide
with the function f(x) at three given points:

$$P(x_1) = f(x_1), \quad P(x_2) = f(x_2), \quad P(x_3) = f(x_3) \tag{21}$$

These three values are also regarded as known.
It is quite evident that there can be only one such polynomial.
Indeed, if another polynomial of the second degree Q(x) satisfied
conditions (21) the difference P(x) - Q(x) (which is also a polynomial
of degree not higher than the second) would be equal to zero at
$x = x_1$, $x = x_2$ and $x = x_3$. This implies that the difference is
identically equal to zero (why is it so?). Thus, all the coefficients of
the difference are equal to zero, and therefore $Q(x) \equiv P(x)$.

Lagrange’s idea is to look for a polynomial P(x) in the form

$$P(x) = A(x - x_2)(x - x_3) + B(x - x_1)(x - x_3) + C(x - x_1)(x - x_2) \tag{22}$$

where A, B and C are some constants yet unknown. It is clear that
this is a polynomial of the second degree. To find the constants
A, B and C let us take conditions (21) and notice that substituting
$x = x_1$, $x = x_2$ and $x = x_3$ into the right-hand side of formula
(22) yields only one nonzero summand while the other two vanish.
Hence we obtain

$$f(x_1) = A(x_1 - x_2)(x_1 - x_3), \quad f(x_2) = B(x_2 - x_1)(x_2 - x_3), \quad f(x_3) = C(x_3 - x_1)(x_3 - x_2)$$

Now we find A, B and C from the latter relations and substitute
them into (22). Thus we have deduced Lagrange’s interpolation
formula

$$f(x) \approx P_2(x) = f(x_1)\frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} + f(x_2)\frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} + f(x_3)\frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} \tag{23}$$

For practical applications of the formula it is desirable that none
of the differences $x_1 - x_2$, $x_1 - x_3$ and $x_2 - x_3$ should be too
small. (Think why this condition is important.)
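Formula (23) and its analogue for an arbitrary number of points can be sketched as follows. This is a minimal illustration: the function name lagrange is hypothetical, and the nodes 1, 2, 4 with the values of $f(x) = x^2$ are chosen only because a polynomial of the second degree must then be reproduced exactly.

```python
def lagrange(xs, ys, x):
    """Value at x of the polynomial passing through the points (xs[i], ys[i]).
    For three points this is formula (23)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)   # factor vanishing at the node xj
        total += term
    return total

nodes = [1.0, 2.0, 4.0]                  # illustrative points x1, x2, x3
values = [t * t for t in nodes]          # f(x) = x^2 at these points
print(lagrange(nodes, values, 3.0))
```

The divisions by $x_i - x_j$ in the inner loop show at once why none of these differences should be too small.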

Conditions (21) may be replaced, for example, by the following
three conditions:

$$P(x_1) = f(x_1), \quad P'(x_1) = f'(x_1), \quad P(x_2) = f(x_2)$$

Then the polynomial P(x) may be looked for in the form

$$P(x) = A(x - x_2)(x - 2x_1 + x_2) + B(x - x_1)(x - x_2) + C(x - x_1)^2$$

instead of (22). (Find the coefficients A, B and C for this case!)

7. Finite Differences and Their Connection with Derivatives. 
Before proceeding to our further investigations let us introduce one 
of the important notions of modern mathematics, namely, the notion 
of a finite difference. Let y = / (x). Then, with h given, the expres- 
sion 


A h y = f (x + h) — f (x) 
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is called a finite difference of the first order of a function /' (the first 
difference of /). The expression 



f(x + h)~f(x) 
h 


is called the first difference quotient or the first divided difference. 
It follows from the definition of a derivative (see Sec. IV. 2) that 
for a sufficiently small h we have 


( 24 ) 

or, more precisely, 

y' = lim ~ Ahi/ 

/t-vO 11 


Let, for example, y — x 3 . Then 

My — (x 4 h) 3 — x 3 = 3x 2 h 4 3 xh 2 4 fi 3 
~ k h y = 3x 2 4- 3xh 4- h 2 

lim (~ My ) — lim (3x 2 4- 3xh 4 h 2 ) = 3x 2 = if 

h-+ o ' 11 ' A-*- o 

We also indicate here the following obvious properties of differences:

$$\Delta_h (y_1 + y_2) = \Delta_h y_1 + \Delta_h y_2$$

$$\Delta_h (Cy) = C \Delta_h y \quad (C = \text{const})$$

We can also take a difference of the first differences, the so-called
second difference:

$$\Delta_h^2 y = \Delta_h (\Delta_h y) = \Delta_h [f(x + h) - f(x)] = [f(x + 2h) - f(x + h)] - [f(x + h) - f(x)] = f(x + 2h) - 2f(x + h) + f(x)$$

The second divided difference is defined in an analogous way:

$$\frac{1}{h} \Delta_h \left( \frac{1}{h} \Delta_h y \right) = \frac{1}{h^2} \Delta_h (\Delta_h y) = \frac{1}{h^2} \Delta_h^2 y = \frac{f(x + 2h) - 2f(x + h) + f(x)}{h^2}$$

Since taking a divided difference with a small step is approximately
equivalent to differentiating, the second divided difference
for a small step is approximately equal to the second derivative or,
more precisely,

$$y'' = \lim_{h \to 0} \frac{1}{h^2} \Delta_h^2 y = \lim_{h \to 0} \frac{f(x + 2h) - 2f(x + h) + f(x)}{h^2} \tag{25}$$

Thus, in the previous example

$$\Delta_h^2 y = \Delta_h (3x^2 h + 3xh^2 + h^3) = 3(x + h)^2 h + 3(x + h) h^2 + h^3 - 3x^2 h - 3xh^2 - h^3 = 6xh^2 + 6h^3;$$

$$\lim_{h \to 0} \frac{1}{h^2} \Delta_h^2 y = \lim_{h \to 0} (6x + 6h) = 6x = y''$$

The third difference $\Delta_h^3 y = \Delta_h (\Delta_h^2 y)$ and the third divided
difference $\frac{1}{h^3} \Delta_h^3 y$ (which tends to the third derivative $y'''$ when passing
to the limit) etc. are defined similarly.
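
The example $y = x^3$ can be checked numerically. The sketch below is only an illustration under our own assumptions: the helper `diff` (a hypothetical name) builds differences of any order by repeated application of the first difference, and the point x = 2 and the step h = 0.001 are chosen arbitrarily.

```python
def diff(f, h, order=1):
    """Return a function computing the finite difference of f of the given order."""
    if order == 0:
        return f
    lower = diff(f, h, order - 1)
    # A difference of order n is the first difference of the difference of order n - 1.
    return lambda x: lower(x + h) - lower(x)

def cube(x):
    return x ** 3

x, h = 2.0, 1e-3
first = diff(cube, h, 1)(x) / h         # equals 3x^2 + 3xh + h^2, close to y' = 12
second = diff(cube, h, 2)(x) / h ** 2   # equals 6x + 6h, close to y'' = 12
print(first, second)
```

Both divided differences deviate from the derivatives by quantities of the order of h, in agreement with the expressions obtained above.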

It is especially convenient to compute the differences when
a function is represented by a table with a constant step h. In case
a table is given in general form (1.2) we can write $\Delta y_1 = y_2 - y_1$,
$\Delta y_2 = y_3 - y_2$ and, generally, $\Delta y_k = y_{k+1} - y_k$. The subscript k
in the expression $\Delta y_k$ now indicates the number of the difference
and not the step because the step is regarded as fixed here. Further,
$\Delta^2 y_1 = \Delta y_2 - \Delta y_1$, $\Delta^2 y_2 = \Delta y_3 - \Delta y_2$ and so on. For instance,
let us take, for h = 0.1, the table

x           10.0     10.1     10.2     10.3     10.4     10.5     10.6     10.7
y           1.00000  1.00432  1.00860  1.01284  1.01703  1.02119  1.02531  1.02938
10^5 Δy     432      428      424      419      416      412      407
10^5 Δ²y    -4       -4       -5       -3       -4       -5
10^5 Δ³y    0        -1       2        -1       -1


(This fragment is taken from the table of logarithms. The values 
of the differences are multiplied by 100,000 in order to get rid of 
decimal zeros.) 

The smallness and the approximate constancy of the second diffe- 
rences in the above example indicate the smoothness of the process 
of change of the function and the absence of random “splashes” in 
the process. Such a smoothness may be manifested in differences 
of higher order and it always indicates the “regularity” of the change 
of a function. In case the step is not small or the values of the argu- 
ment are close to the points of discontinuity etc. the differences 
may not be small but, as a rule, a certain kind of regularity in their 
values can be noticed. At the same time random errors occurring 
in the table greatly influence differences of higher order and this 
usually enables us to find the errors. Suppose that by mistake we
wrote 1.01294 instead of 1.01284 in the table. Then the fourth line
would look as -4, +6, -25, +7, -4, -5 (check it!) and the regularity
would obviously be broken. That is why differences of an
order higher than the second are rarely used when a table represents
results of an experiment which was not carried out with high precision.
In such cases one often restricts oneself to the first differences.
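
The construction of the lines of differences, and the way a single misprint spreads through them, can be reproduced by a short computation. This is a minimal sketch; the function name `difference_rows` is our own, and the tabular values are multiplied by 100,000 and stored as integers (an assumption made only to avoid rounding noise).

```python
def difference_rows(values, max_order):
    """Return [values, first differences, second differences, ...]."""
    rows = [list(values)]
    for _ in range(max_order):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows

# Tabulated values of y (logarithms of 10.0, 10.1, ..., 10.7) times 10**5.
y = [100000, 100432, 100860, 101284, 101703, 102119, 102531, 102938]
for order, row in enumerate(difference_rows(y, 3)[1:], start=1):
    print("order", order, row)

# The same table with the misprint 1.01294 in place of 1.01284.
y_bad = list(y)
y_bad[3] = 101294
print("second differences with the misprint:", difference_rows(y_bad, 2)[2])
```

An error of 10 units in one tabular value produces deviations of 10, 20 and 10 units in three neighbouring second differences, which is why it is so easily noticed.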
The difference $y_{k+1} - y_k$ is sometimes attributed not to the
value $x_k$, as above, but to the value $x_k + \frac{h}{2}$ which is naturally
denoted as $x_{k+\frac{1}{2}}$. Then the difference is called central and is
designated by $\delta y_{k+\frac{1}{2}} = y_{k+1} - y_k$. Dividing the difference by the step
we obtain a divided central difference. The central differences of
the second order $\delta^2 y_k = \delta y_{k+\frac{1}{2}} - \delta y_{k-\frac{1}{2}}$ are formed similarly. They
are again attributed to the “integer” values of the argument etc. (of
course, it is not the values $x_k$ of the argument x that are integers but
their numbers k; the same is with the “half-integer” values $x_{k+\frac{1}{2}}$ etc.).

It is seen in Fig. 138 that the value of the divided central difference,
which is equal to the slope of the chord BC, is closer to the derivative
(i.e. to the slope of the tangent at the point A) than the “simple”
divided difference (which is equal to the slope of the chord AD). This
assertion can be easily verified by Taylor’s series (IV.52) since the
difference

$$y'(x) - \frac{\Delta_h y}{h} = y'(x) - \frac{y(x + h) - y(x)}{h}$$

is of the order of h for small | h | whereas the difference

$$y'(x) - \frac{\delta y}{h} = y'(x) - \frac{y\left(x + \frac{h}{2}\right) - y\left(x - \frac{h}{2}\right)}{h}$$

is of the order of $h^2$ (check it!). (The estimation of orders of accuracy
of other approximate formulas for a small step is carried out similarly.)
Thus, it is better to compute the approximate values of
a derivative by means of a divided central difference than by formula (24).
A more precise method of approximate calculation of
derivatives of any order will be given in Sec. 9.
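
The different orders of accuracy of the two divided differences can be observed numerically. The following is a sketch under our own assumptions: the test function sin x, the point x = 1 and the steps are chosen only for illustration, and the names `forward_quotient` and `central_quotient` are hypothetical.

```python
import math

def forward_quotient(f, x, h):
    """The "simple" divided difference (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

def central_quotient(f, x, h):
    """The divided central difference (f(x + h/2) - f(x - h/2)) / h."""
    return (f(x + h / 2) - f(x - h / 2)) / h

x = 1.0
exact = math.cos(x)  # the derivative of sin x
for h in (0.1, 0.05, 0.025):
    err_forward = abs(forward_quotient(math.sin, x, h) - exact)
    err_central = abs(central_quotient(math.sin, x, h) - exact)
    print(h, err_forward, err_central)
```

When the step is halved the first error is roughly halved whereas the second one decreases roughly four times, as should be expected for errors of the orders of h and $h^2$.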


Divided central differences for a small step are close to the corresponding
derivatives and resemble them in many respects whereas
the differences themselves (not divided) are close to the corresponding
differentials. For instance, formula (25) implies

$$\frac{1}{h^2} \Delta_h^2 y = y'' + \alpha$$

where $|\alpha| \ll 1$, i.e. $\alpha$ is an infinitesimal as $h \to 0$. It follows that

$$\Delta_h^2 y = y'' h^2 + \alpha h^2 = y'' (\Delta x)^2 + \alpha h^2 = d^2 y + \alpha h^2 \quad (|\alpha h^2| \ll h^2)$$

Hence, in case $y'' \ne 0$ the value of $\Delta_h^2 y$ differs from that of $d^2 y$ by
an infinitesimal of higher order and $\Delta^2 y$ and $d^2 y$ are therefore equivalent
infinitesimals as $h \to 0$ (see Sec. III.8).

The theory of finite differences was developed simultaneously 
with other basic branches of mathematical analysis. The first syste- 
matic representation of calculus of finite differences was given by 
Taylor in 1715. Finite differences are nowadays widely used in many 
theoretical investigations and in practical applications especially 
in connection with modern electronic computers. 

8. Newton’s Interpolation Formulas. If the distance h between 
neighbouring values of x for which a function f is given is constant
we can use some formulas that are more convenient than formula 
(23). For example, suppose we know the values 

$$f(x_0) = y_0, \quad f(x_1) = y_1, \quad f(x_2) = y_2, \quad f(x_3) = y_3$$

where $x_1 = x_0 + h$, $x_2 = x_0 + 2h$ and $x_3 = x_0 + 3h$. Then the
polynomial P(x) taking the same values for the appointed values
of x will be of the third degree (see Sec. 6). Newton's idea was to
seek P(x) as a polynomial of the form

$$P(x) = A + Bs + Cs(s - h) + Ds(s - h)(s - 2h) \tag{26}$$

where $s = x - x_0$. According to the condition there must be

$$y_0 = P(x_0) = P|_{s=0} = A, \quad y_1 = P(x_1) = P|_{s=h} = A + Bh,$$

$$y_2 = P(x_2) = P|_{s=2h} = A + B \cdot 2h + C \cdot 2h^2,$$

$$y_3 = P(x_3) = P|_{s=3h} = A + B \cdot 3h + C \cdot 3 \cdot 2h^2 + D \cdot 3 \cdot 2h^3$$

Writing differences (see Sec. 7) for the left-hand and right-hand
sides we obtain

$$\Delta y_0 = Bh, \quad \Delta y_1 = Bh + C \cdot 2h^2, \quad \Delta y_2 = Bh + C \cdot 2 \cdot 2h^2 + D \cdot 3 \cdot 2h^3$$

Forming differences a second and a third time we get

$$\Delta^2 y_0 = C \cdot 2h^2, \quad \Delta^2 y_1 = C \cdot 2h^2 + D \cdot 3 \cdot 2h^3, \quad \Delta^3 y_0 = D \cdot 3 \cdot 2h^3$$

From this we find $A = y_0$, $B = \frac{\Delta y_0}{h}$, $C = \frac{\Delta^2 y_0}{2!\,h^2}$ and $D = \frac{\Delta^3 y_0}{3!\,h^3}$.

Substituting these values in (26) and taking into account that we
could start from any tabular value $x_k$ in place of $x_0$ we derive Newton’s
formula

$$f(x) \approx P(x) = y_k + \frac{\Delta y_k}{1!} \frac{s}{h} + \frac{\Delta^2 y_k}{2!} \frac{s}{h} \left( \frac{s}{h} - 1 \right) + \frac{\Delta^3 y_k}{3!} \frac{s}{h} \left( \frac{s}{h} - 1 \right) \left( \frac{s}{h} - 2 \right) \tag{27}$$

where $s = x - x_k$.

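Formulas for interpolation polynomials of other degrees are of
a similar form. Increasing this degree we can pass to the limit as
it was done in Sec. IV.16 and thus obtain an infinite series of the
form

$$f(x) = y_k + \frac{\Delta y_k}{1!} \frac{s}{h} + \frac{\Delta^2 y_k}{2!} \frac{s}{h} \left( \frac{s}{h} - 1 \right) + \frac{\Delta^3 y_k}{3!} \frac{s}{h} \left( \frac{s}{h} - 1 \right) \left( \frac{s}{h} - 2 \right) + \dots \tag{28}$$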

The terms that are not put down in formula (28) contain differences 
of the fourth, fifth etc. orders and are therefore infinitesimals of 
the fourth, fifth etc. orders relative to the step h. This formula is, 
of course, truncated in practical calculations. The number of retai- 
ned terms must be chosen in such a way that the dropped terms 
should be negligibly small. It may turn out that it is impossible 
to attain this in case the step is too large or if we consider points 
lying near the end-point of the interval. In such circumstances for- 
mula (28) is inapplicable. 
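
As an illustration, Newton's formula truncated after the third differences can be applied to the beginning of the table of logarithms given in Sec. 7. The sketch below rests on our own assumptions: the function name `newton_forward` and the choice of the point x = 10.05 are hypothetical, and the quantity `u` in the code stands for s/h.

```python
import math

def newton_forward(x, x0, h, y, terms=4):
    """Evaluate Newton's formula, truncated after `terms` terms, starting at node x0."""
    # Leading differences y_0, Δy_0, Δ²y_0, ... taken from the top of the table.
    diffs, row = [], list(y)
    for _ in range(terms):
        diffs.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    u = (x - x0) / h           # this is s/h in the text
    total, factor = 0.0, 1.0
    for n, d in enumerate(diffs):
        total += factor * d    # factor = u(u - 1)...(u - n + 1) / n!
        factor *= (u - n) / (n + 1)
    return total

y = [1.00000, 1.00432, 1.00860, 1.01284]   # logarithms of 10.0, 10.1, 10.2, 10.3
approx = newton_forward(10.05, 10.0, 0.1, y)
print(approx, math.log10(10.05))
```

The two printed numbers should agree to about five decimal places, which is the accuracy of the table itself.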

Newton’s formulas (27) and (28) are easy to use when a function f
is represented in a tabular way since in such a case its differences
are calculated quite simply. They are especially often applied in
the beginning of a table (when, for example, we take k = 0, i.e.
the first tabular value $x_0$, and $x_0 < x < x_1$). We choose the degree
of an interpolation polynomial P(x) considering the values of the
differences. For example, if the third differences are very small then
the last term in formula (27) is also small and can be dropped, that
is we can restrict ourselves to a polynomial of the second degree.
In formula (28) it is also possible to put k = 0, $x < x_0$ if $|x - x_0|$
is not large; this will result in a backward extrapolation of the table.

Another formula of Newton,

$$f(x) = y_{k+1} - \frac{\Delta y_k}{1!} \frac{t}{h} + \frac{\Delta^2 y_{k-1}}{2!} \frac{t}{h} \left( \frac{t}{h} - 1 \right) - \frac{\Delta^3 y_{k-2}}{3!} \frac{t}{h} \left( \frac{t}{h} - 1 \right) \left( \frac{t}{h} - 2 \right) + \dots \tag{29}$$

where $t = x_{k+1} - x$, can be deduced like (28). This formula is used,
in particular, at the end of a table, for example, if $x_{k+1}$ is the last
tabular value of the argument and $x_k < x < x_{k+1}$. This formula
is also used for a forward extrapolation of a table.

When interpolation is carried out in the middle of a table it is 
desirable to have a formula that takes into account tabular values 
lying on the left and on the right of the value x we are interested in. 
One of such formulas is Bessel's formula which is obtained by taking 
the half-sum of the right-hand sides of (28) and (29): 

ni)= n±^L +Ayk (i_i) (t-‘) + 

+ < 30 > 

where $s = x - x_k$. This formula provides a high accuracy; it was
named after F. W. Bessel (1784-1846), a German astronomer, but
it was in fact established by Newton.

Interpolation formulas are also used for the so-called problem 
of inverse interpolation which consists in finding the values of the 
argument for given values of a function. Let us begin, for example, 
with formula (27). Regarding this equality as an exact one we can 
solve it for the second summand in the right-hand side which, after 
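division by $\Delta y_k$, yields

$$\frac{s}{h} = \frac{y - y_k}{\Delta y_k} - \frac{\Delta^2 y_k}{2!\,\Delta y_k} \frac{s}{h} \left( \frac{s}{h} - 1 \right) - \frac{\Delta^3 y_k}{3!\,\Delta y_k} \frac{s}{h} \left( \frac{s}{h} - 1 \right) \left( \frac{s}{h} - 2 \right) \tag{31}$$

If y is given then in order to find s we can apply the iterative
method (see Sec. 3). To do this we can choose $\left( \frac{s}{h} \right)_0 = \frac{y - y_k}{\Delta y_k}$ as the
zeroth approximation. Substituting this value into the right-hand
side of (31) we obtain $\left( \frac{s}{h} \right)_1$ and so forth. The process converges
well for small h.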
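
The iterative process for inverse interpolation may be sketched as follows. The function name `inverse_interpolate`, the fixed number of iterations and the sample problem (finding x for which the tabulated logarithm equals 1.00200, with the differences taken at x = 10.0 from the table of Sec. 7) are our own assumptions.

```python
def inverse_interpolate(y_target, x_k, h, y_k, d1, d2, d3, iterations=10):
    """Find x with f(x) = y_target by iterating on u = s/h; d1, d2, d3 are
    the first, second and third differences at the node x_k."""
    u = (y_target - y_k) / d1                      # zeroth approximation
    for _ in range(iterations):
        u = ((y_target - y_k) / d1
             - d2 / (2 * d1) * u * (u - 1)
             - d3 / (6 * d1) * u * (u - 1) * (u - 2))
    return x_k + u * h

x = inverse_interpolate(1.00200, 10.0, 0.1, 1.00000, 0.00432, -0.00004, 0.0)
print(x, 10 ** 1.002)
```

Since the correcting terms contain the small ratios of the higher differences to the first difference, a few iterations are usually sufficient.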

It should be taken into account that when we interpolate a dis- 
continuous function or a function with a discontinuous derivative 
the accuracy of an approximation may decrease considerably near 
the points of discontinuity since the interpolation polynomial itself 
has no discontinuities. A discontinuity may be imitated by bringing 
nodes of interpolation very close to each other but it is usually 
preferable to carry out an interpolation process only on intervals 
lying between the points of discontinuity. 

9. Numerical Differentiation. Numerical differentiation is usually 
applied when a function whose derivative must be found is defined 
by a table. This can be carried out as follows: the function is replaced 
by a polynomial according to the methods of Secs. 6 and 8 and then 
the derivatives of the polynomial are taken as the approximations
to the derivatives of the original function. For example, formula (27)
implies (check it!)

$$f'(x) \approx \frac{\Delta y_k}{h} + \frac{\Delta^2 y_k}{h} \left( \frac{s}{h} - \frac{1}{2} \right) + \frac{\Delta^3 y_k}{2h} \left[ \left( \frac{s}{h} \right)^2 - 2 \frac{s}{h} + \frac{2}{3} \right]$$

[Taking formula (28) with a greater number of terms we can get
a more accurate result.] In particular, putting $x = x_k$ (i.e. s = 0)
we derive

$$f'(x_k) \approx \frac{1}{h} \left( \Delta y_k - \frac{\Delta^2 y_k}{2} + \frac{\Delta^3 y_k}{3} \right)$$

A more precise formula may be represented as an infinite series
of the form

$$f'(x_k) = \frac{1}{h} \left( \frac{\Delta y_k}{1} - \frac{\Delta^2 y_k}{2} + \frac{\Delta^3 y_k}{3} - \frac{\Delta^4 y_k}{4} + \dots \right) \tag{32}$$

In like manner we can deduce formulas for the derivatives of the 
second and subsequent orders. Formulas (29) and (30) (and other 
interpolation formulas) may also be used for these purposes. 

Let us, in particular, put down the following formula implied 
by formula (30): 

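$$f'(x_k) = \frac{1}{h} \left[ \frac{1}{2} (\Delta y_{k-1} + \Delta y_k) - \frac{1}{12} (\Delta^3 y_{k-2} + \Delta^3 y_{k-1}) + \frac{1}{60} (\Delta^5 y_{k-3} + \Delta^5 y_{k-2}) - \dots \right]$$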

The subsequent terms of the last formula are, respectively, of the 
first, third, fifth etc. orders, i.e. the coefficients decrease and the 
orders of infinitesimals increase here faster than those of the terms 
of series (32). 
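
Both kinds of formulas can be tried on the table of logarithms from Sec. 7, for which the exact derivative 1/(x ln 10) is known. The sketch below is an illustration under our own assumptions: the helper `diff_rows` is a hypothetical name, the one-sided estimate uses the first three terms of series (32) at x = 10.0 and the two-sided estimate uses the first two terms of the central formula at x = 10.3.

```python
import math

def diff_rows(values, max_order):
    """Return rows d with d[n][k] equal to the n-th difference Δ^n y_k."""
    rows = [list(values)]
    for _ in range(max_order):
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows

h = 0.1
y = [1.00000, 1.00432, 1.00860, 1.01284, 1.01703, 1.02119, 1.02531, 1.02938]
d = diff_rows(y, 3)

# One-sided estimate at x_0 = 10.0: first three terms of series (32).
forward = (d[1][0] - d[2][0] / 2 + d[3][0] / 3) / h

# Two-sided estimate at x_3 = 10.3: first two terms of the central formula.
k = 3
central = ((d[1][k - 1] + d[1][k]) / 2 - (d[3][k - 2] + d[3][k - 1]) / 12) / h

print(forward, 1 / (10.0 * math.log(10)))
print(central, 1 / (10.3 * math.log(10)))
```

The estimates cannot be more accurate than a few units of the fourth decimal place because the rounding errors of the table (half a unit of the fifth decimal place) are divided by the step h = 0.1.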

If a table represents some results of an experiment then even 
a small error in a value of a function may lead (after the division 
by a small step) to a finite or even a large error in computing a value
of the derivative. The situation is getting still worse when we cal- 
culate derivatives of higher order. It is therefore desirable that the 
step of a table should be considerably greater (e.g. 10 times greater) 
than the possible error in determining the values of the function. 
The step should be still greater (e.g. 100 times greater) than the 
error when we calculate the second derivative and so on. As a result 
of these difficulties it is usually preferable to use other (empirical) 
formulas (compare with Sec. 1.30) instead of interpolation formulas 
when we differentiate empirical functions. Since empirical formulas
are constructed by taking into account all the peculiarities of expe- 
rimental data they are considerably more stable with respect to 
random errors of an experiment. 

Interpolation formulas and formulas for numerical differentiation
are treated in courses on approximate calculations. 

Determinants and Systems of Linear Algebraic Equations


§ 1. Determinants 

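1. Definition. The concept of a determinant arises when we investigate
systems of algebraic equations of the first degree. Let us
first take the system of equations

$$\left. \begin{aligned} a_1 x + b_1 y &= d_1 \\ a_2 x + b_2 y &= d_2 \end{aligned} \right\} \tag{1}$$

in two unknowns x and y. Solving the system (we leave the calculations
to the reader) we get the answer

$$x = \frac{d_1 b_2 - b_1 d_2}{a_1 b_2 - b_1 a_2}, \quad y = \frac{a_1 d_2 - d_1 a_2}{a_1 b_2 - b_1 a_2} \tag{2}$$

The expression $a_1 b_2 - b_1 a_2$ is called the determinant of the second
order. It is designated by the symbol $\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}$:

$$a_1 b_2 - b_1 a_2 = \begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} \tag{3}$$

This notation enables us to rewrite formulas (2) in the form

$$x = \frac{\begin{vmatrix} d_1 & b_1 \\ d_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}, \quad y = \frac{\begin{vmatrix} a_1 & d_1 \\ a_2 & d_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} \tag{4}$$

Let us consider an example of computing a determinant:

$$\begin{vmatrix} 0 & -3 \\ 2 & 1 \end{vmatrix} = 0 \cdot 1 - (-3) \cdot 2 = 0 + 6 = 6$$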


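The same process applied to solving the system of equations

$$\left. \begin{aligned} a_1 x + b_1 y + c_1 z &= d_1 \\ a_2 x + b_2 y + c_2 z &= d_2 \\ a_3 x + b_3 y + c_3 z &= d_3 \end{aligned} \right\} \tag{5}$$

yields the fractions which have the denominator of the form

$$a_1 b_2 c_3 - a_1 c_2 b_3 - b_1 a_2 c_3 + b_1 c_2 a_3 + c_1 a_2 b_3 - c_1 b_2 a_3 \tag{6}$$

The last expression is called the determinant of the third order and
is designated as

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} \tag{7}$$

Transforming expression (6) and bearing in mind notation (3) we
derive the formula

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1 (b_2 c_3 - c_2 b_3) - b_1 (a_2 c_3 - c_2 a_3) + c_1 (a_2 b_3 - b_2 a_3) = a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - b_1 \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} \tag{8}$$

which is of use for calculating a determinant. For instance,

$$\begin{vmatrix} 1 & 0 & -2 \\ -1 & 1 & 1 \\ 3 & 1 & 2 \end{vmatrix} = 1 \begin{vmatrix} 1 & 1 \\ 1 & 2 \end{vmatrix} - 0 \begin{vmatrix} -1 & 1 \\ 3 & 2 \end{vmatrix} + (-2) \begin{vmatrix} -1 & 1 \\ 3 & 1 \end{vmatrix} = 1 (1 \cdot 2 - 1 \cdot 1) - 2 (-1 \cdot 1 - 1 \cdot 3) = 1 + 8 = 9$$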

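Determinants of the fourth order are defined by analogy with
formula (8):

$$\begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix} - b_1 \begin{vmatrix} a_2 & c_2 & d_2 \\ a_3 & c_3 & d_3 \\ a_4 & c_4 & d_4 \end{vmatrix} + c_1 \begin{vmatrix} a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \\ a_4 & b_4 & d_4 \end{vmatrix} - d_1 \begin{vmatrix} a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \\ a_4 & b_4 & c_4 \end{vmatrix}$$

(we suggest that the reader should carefully think over the structure
of the expression entering in the right-hand side). Determinants of
the fifth, sixth etc. order (and, generally, determinants of the nth
order) are introduced in a similar way.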
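
The recursive character of this definition (a determinant of the nth order is expressed through determinants of the (n - 1)th order) is easily turned into a program. The following is a minimal sketch; the function name `det` and the sample matrices are our own, and the method is practical only for small n because the number of operations grows like n!.

```python
def det(m):
    """Determinant of a square matrix (given as a list of rows) by first-row expansion."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: delete the first row and the j-th column.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        # The signs alternate along the first row: +, -, +, ...
        total += (-1) ** j * m[0][j] * det(minor)
    return total

print(det([[0, -3], [2, 1]]))
print(det([[1, 0, -2], [-1, 1, 1], [3, 1, 2]]))
print(det([[1, 0, 2, -1], [3, 1, 0, -1], [2, 1, -1, 0], [0, 3, 2, 1]]))
```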

2. Properties. We are going to describe the properties of determinants
but we shall take only the determinants of the third order
of form (7) although all these properties hold for determinants of
any order.

A determinant of the third order of form (7) has three rows and
three columns. It consists of nine elements (i.e. of the numbers
$a_1, b_1, \dots, c_3$).

1. Interchanging two rows or two columns of a determinant is
equivalent to multiplying the determinant by -1. For example,

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = - \begin{vmatrix} c_1 & b_1 & a_1 \\ c_2 & b_2 & a_2 \\ c_3 & b_3 & a_3 \end{vmatrix} \tag{9}$$

(we have interchanged the third and the first columns). This is
proved by comparing both sides of (9) according to formula (8):

$$- \begin{vmatrix} c_1 & b_1 & a_1 \\ c_2 & b_2 & a_2 \\ c_3 & b_3 & a_3 \end{vmatrix} = - c_1 \begin{vmatrix} b_2 & a_2 \\ b_3 & a_3 \end{vmatrix} + b_1 \begin{vmatrix} c_2 & a_2 \\ c_3 & a_3 \end{vmatrix} - a_1 \begin{vmatrix} c_2 & b_2 \\ c_3 & b_3 \end{vmatrix} = - c_1 (b_2 a_3 - a_2 b_3) + b_1 (c_2 a_3 - a_2 c_3) - a_1 (c_2 b_3 - b_2 c_3)$$

If we remove the parentheses here we shall obtain the expression
equal to (6). This proves formula (9).

2. A determinant having two identical rows or columns is equal
to zero. For example,

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_2 & b_2 & c_2 \end{vmatrix} = 0$$

(here the second row coincides with the third one). Indeed, if
we interchange the two identical rows then by property 1 the value D
of the determinant satisfies -D = D, i.e. D = 0.

3. A common factor entering into all the elements of a row or of
a column can be taken outside the determinant. For instance,

$$\begin{vmatrix} a_1 & kb_1 & c_1 \\ a_2 & kb_2 & c_2 \\ a_3 & kb_3 & c_3 \end{vmatrix} = k \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} \tag{10}$$

The proof is carried out by verifying the equality.

4. A determinant having a row or a column consisting of zeros
equals zero. In order to prove this assertion it is sufficient to put
k = 0 in formula (10).

5. If each element of a row or of a column (for instance, of the
second row) can be represented in the form of a sum of two terms
the determinant itself can be represented as a sum of two determinants
according to the formula

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2' + a_2'' & b_2' + b_2'' & c_2' + c_2'' \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2' & b_2' & c_2' \\ a_3 & b_3 & c_3 \end{vmatrix} + \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2'' & b_2'' & c_2'' \\ a_3 & b_3 & c_3 \end{vmatrix}$$

The proof is carried out by verifying the equality of both sides.

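6. Adding arbitrary numbers proportional to the elements of
a row (a column) to the corresponding elements of another row (column)
of a determinant we do not change the numerical value of
the determinant. Indeed, for instance,

$$\begin{vmatrix} a_1 + kc_1 & b_1 & c_1 \\ a_2 + kc_2 & b_2 & c_2 \\ a_3 + kc_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} + \begin{vmatrix} kc_1 & b_1 & c_1 \\ kc_2 & b_2 & c_2 \\ kc_3 & b_3 & c_3 \end{vmatrix} =$$

$$= \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} + k \begin{vmatrix} c_1 & b_1 & c_1 \\ c_2 & b_2 & c_2 \\ c_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$$

(in the calculations we have applied, in succession, properties 5, 3
and 2).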

7. The value of a determinant does not change if each of its rows
is replaced by the corresponding column and vice versa, that is

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$

(this is the operation of transposing a determinant). The proof is
carried out by verifying the equality.
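
The verification of properties 1, 3, 6 and 7 can also be entrusted to a computer for a particular numerical determinant. The sketch below is only an illustration: the function name `det3`, the test matrix and the factors 5 and 7 are our own arbitrary choices, and a numerical check on one example is of course not a proof.

```python
def det3(m):
    """Third-order determinant computed by expansion along the first row."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * (b2 * c3 - c2 * b3)
            - b1 * (a2 * c3 - c2 * a3)
            + c1 * (a2 * b3 - b2 * a3))

m = [[1, 0, -2], [-1, 1, 1], [3, 1, 2]]   # arbitrary integer test matrix
d = det3(m)

swapped = [[row[2], row[1], row[0]] for row in m]               # property 1
scaled = [[row[0], 5 * row[1], row[2]] for row in m]            # property 3
sheared = [[row[0] + 7 * row[2], row[1], row[2]] for row in m]  # property 6
transposed = [list(col) for col in zip(*m)]                     # property 7

print(det3(swapped) == -d, det3(scaled) == 5 * d,
      det3(sheared) == d, det3(transposed) == d)
```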

3. Expanding a Determinant in Minors of Its Row or Column. 
First we introduce the notion of a cofactor of an element of a deter- 
minant. Suppose we choose an element of a determinant and then 
delete the row and the column to which the element belongs. Thus 
we get a determinant of lower order which is called the minor 
(minor determinant) of the element of a determinant. Now let us 
supply each minor with the sign + or - depending on the position
the corresponding element occupies in the original determinant 
according to the following rule: we take + for the minor of the 
element standing in the left upper corner of a determinant (this 
element is sometimes called the origin of the determinant) and 
alternate the signs of other elements in chess-board order according 
to the scheme 

+ - + 

- + - 

+ - + 





INTRODUCTORY MATHEMATICS FOR ENGINEERS 


The quantities thus obtained are called cofactors (or signed minors
or algebraic adjuncts) of the elements of a determinant. For instance,
the cofactor of the element $a_1$ [see determinant (7)] is equal to
$\begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix}$, the cofactor $C_2$ of the element $c_2$ is equal to
$- \begin{vmatrix} a_1 & b_1 \\ a_3 & b_3 \end{vmatrix}$ etc.

There is a general property of determinants which is formulated
as follows: a determinant is equal to the sum of the products of the
elements of any row or column of the determinant by their cofactors.
For example,

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = b_1 B_1 + b_2 B_2 + b_3 B_3 = - b_1 \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + b_2 \begin{vmatrix} a_1 & c_1 \\ a_3 & c_3 \end{vmatrix} - b_3 \begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}$$


This representation of a determinant is called the expansion of the 
determinant in minors of its row or column (we have the expansion 
in minors of the second column in the example). We also say that 
the determinant is expanded in terms of the elements of its row or 
column. As in Sec. 2, the proof is based on verifying the equality. 

The properties enumerated in Secs. 2 and 3 are applied to evaluating
determinants. For example, let us compute the determinant

$$D = \begin{vmatrix} 1 & 0 & 2 & -1 \\ 3 & 1 & 0 & -1 \\ 2 & 1 & -1 & 0 \\ 0 & 3 & 2 & 1 \end{vmatrix}$$

Here we can apply property 6 of Sec. 2 and make all the elements
of a certain row or column but one be zero. Then expanding
the determinant in minors of this row or column we get only one
nonzero summand because all other minors are multiplied by zeros.
If, for instance, we want only the element occupying the second
place in the third row of the determinant to be unequal to zero we
should multiply the second column by -2, add it to the first column
and after this add the second column of the resulting determinant
to its third column. Then we obtain

$$D = \begin{vmatrix} 1 & 0 & 2 & -1 \\ 1 & 1 & 0 & -1 \\ 0 & 1 & -1 & 0 \\ -6 & 3 & 2 & 1 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 2 & -1 \\ 1 & 1 & 1 & -1 \\ 0 & 1 & 0 & 0 \\ -6 & 3 & 5 & 1 \end{vmatrix}$$

(the two operations are usually carried out simultaneously, by “one
step”). Now expanding in minors of the third row we obtain

$$D = -1 \cdot \begin{vmatrix} 1 & 2 & -1 \\ 1 & 1 & -1 \\ -6 & 5 & 1 \end{vmatrix}$$

Here we can, for example, subtract the second row from the first
one which results in

$$D = - \begin{vmatrix} 0 & 1 & 0 \\ 1 & 1 & -1 \\ -6 & 5 & 1 \end{vmatrix}$$

Now if we expand D in minors of the first row we finally obtain

$$D = -(-1) \begin{vmatrix} 1 & -1 \\ -6 & 1 \end{vmatrix} = 1 \cdot 1 - (-1)(-6) = -5$$

In case the elements of a determinant are approximate numbers
the method is used in a modified form. We shall illustrate this technique
by evaluating the determinant

$$D = \begin{vmatrix} -1.37 & 2.15 & 0.76 \\ 2.31 & -1.78 & -4.32 \\ -0.86 & 2.13 & 3.25 \end{vmatrix}$$

Let us factor out the first element of the first row using the rule of
a reserve decimal digit (see Sec. 1.9):

$$D = -1.37 \begin{vmatrix} 1 & -1.569 & -0.555 \\ 2.31 & -1.78 & -4.32 \\ -0.86 & 2.13 & 3.25 \end{vmatrix}$$

Multiply the first row by -2.31 and add it to the second row and,
simultaneously, multiply the first row by 0.86 and add it to the
third row:

$$D = -1.37 \begin{vmatrix} 1 & -1.569 & -0.555 \\ 0 & 1.844 & -3.038 \\ 0 & 0.781 & 2.773 \end{vmatrix} = -1.37 \begin{vmatrix} 1.844 & -3.038 \\ 0.781 & 2.773 \end{vmatrix}$$

Let us repeat this procedure:

$$D = -1.37 \times 1.844 \begin{vmatrix} 1 & -1.647 \\ 0.781 & 2.773 \end{vmatrix} = -1.37 \times 1.844 \begin{vmatrix} 1 & -1.647 \\ 0 & 4.059 \end{vmatrix} = -1.37 \times 1.844 \times 4.059 = -10.3$$
This method is also applicable to determinants of higher order.
It cannot be used in case some of the determinants occurring in the
process of calculating contain the number 0 (or a number which
is close to zero and known with a low degree of relative accuracy)
as its left uppermost element. To overcome the difficulty we can
slightly change the method and begin not with the left uppermost
element but with the element which is the greatest in its absolute
value. This element (the principal element) should be factored out
of the row (column) in which it is contained. (The element -4.32
is the principal element in our previous example.)
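
The procedure is easily programmed. The sketch below is not exactly the method described above but a widespread simplification of it, which we assume here for brevity: at each step the principal element is sought only in the current column (so that rows but not columns are interchanged). The function name `det_by_elimination` is our own.

```python
def det_by_elimination(matrix):
    """Determinant by elimination; the pivot is the largest element of the current column."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row whose entry in this column is largest in absolute value.
        pivot_row = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot_row][col] == 0.0:
            return 0.0
        if pivot_row != col:
            a[col], a[pivot_row] = a[pivot_row], a[col]
            det = -det                          # property 1: a row interchange changes the sign
        det *= a[col][col]                      # the pivot is factored out
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]   # property 6: the value is unchanged
    return det

d = det_by_elimination([[-1.37, 2.15, 0.76],
                        [2.31, -1.78, -4.32],
                        [-0.86, 2.13, 3.25]])
print(d)
```

The printed value should agree with the result -10.3 of the hand computation to the accuracy retained there.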

The fundamentals of the theory of determinants were introduced 
in 1750 by the Swiss mathematician G. Cramer (1704-1752). 


§ 2. Systems of Linear Algebraic Equations 


4. Basic Case. We shall limit ourselves to such systems in which
the number of equations coincides with the number of the unknowns.
We shall deal only with the systems of three equations, i.e. systems
of type (5). For example, if we want to find y we should multiply
the first of equations (5) by the cofactor $B_1$ of the element $b_1$ of
determinant (7), multiply the second equation by $B_2$ and the third
one by $B_3$ and then sum together the results. Doing this we derive

$$(a_1 B_1 + a_2 B_2 + a_3 B_3) x + (b_1 B_1 + b_2 B_2 + b_3 B_3) y + (c_1 B_1 + c_2 B_2 + c_3 B_3) z = d_1 B_1 + d_2 B_2 + d_3 B_3 \tag{11}$$

But the expression inside the first parentheses is equal to

$$\begin{vmatrix} a_1 & a_1 & c_1 \\ a_2 & a_2 & c_2 \\ a_3 & a_3 & c_3 \end{vmatrix} \tag{12}$$

In fact, if we expand the determinant in minors of the second column
we obtain the sum of the products of the elements $a_1$, $a_2$ and $a_3$
by their cofactors, these cofactors in determinant (12) being equal
to the cofactors of the elements $b_1$, $b_2$ and $b_3$ in determinant (7),
i.e. equal to $B_1$, $B_2$ and $B_3$, respectively. Determinant (12), by property 2
in Sec. 2, is therefore equal to zero. By the same reason,
the expression in the third parentheses of formula (11) also equals
zero whereas the right-hand side of (11) is equal to

$$\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}$$

The expression in the second parentheses in (11) is just equal to
determinant (7) itself by Sec. 3. This determinant consists of the
coefficients in unknown quantities in system (5) and is called the
determinant of the system; we shall denote it, for brevity, by the
letter D. Thus, from (11) we deduce

$$Dy = D_y, \quad \text{where} \quad D_y = \begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix} \tag{13}$$

In the same way we find

$$Dx = D_x, \quad Dz = D_z, \quad \text{where} \quad D_x = \begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix} \quad \text{and} \quad D_z = \begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix} \tag{14}$$

Now let us first suppose that $D \ne 0$; this is the basic case. As
will be shown in the end of Sec. X.7, in this case the system in
question has a unique solution. From (13) and (14) we derive the
formulas for the solution:


$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{D}, \quad y = \frac{\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}}{D}, \quad z = \frac{\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}}{D}$$

Thus, every unknown quantity is equal to a quotient which has the
determinant of the system in the denominator, the numerator being the
determinant which is obtained from the determinant of the system by
substituting the column consisting of the right-hand members of system
(5) for the column of the coefficients in that unknown.

Let us take, for example, the system of equations

$$\left. \begin{aligned} x - 2z &= 1 \\ 2x + y - z &= 0 \\ x - 2y + z &= -2 \end{aligned} \right\}$$

Suppose it is necessary to find the value of z. Then

$$z = \frac{\begin{vmatrix} 1 & 0 & 1 \\ 2 & 1 & 0 \\ 1 & -2 & -2 \end{vmatrix}}{\begin{vmatrix} 1 & 0 & -2 \\ 2 & 1 & -1 \\ 1 & -2 & 1 \end{vmatrix}} = \frac{1 (-2 + 0) - 0 + 1 (-4 - 1)}{1 (1 - 2) - 0 - 2 (-4 - 1)} = -\frac{7}{9}$$

In the same way we compute the remaining unknowns and it is
only the numerators that should be evaluated since the denominators
equal D = 9.

The above formulas are called Cramer’s rule. If we apply the
formulas to system (1) of two equations in two unknowns we obtain
formulas (4).
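
Cramer's rule for a system of three equations can be sketched in a few lines. The function names `det3` and `cramer3` are our own assumptions; exact fractions are used so that the result is not obscured by rounding, and the example is the system considered above.

```python
from fractions import Fraction

def det3(m):
    """Third-order determinant computed by expansion along the first row."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * (b2 * c3 - c2 * b3)
            - b1 * (a2 * c3 - c2 * a3)
            + c1 * (a2 * b3 - b2 * a3))

def cramer3(coeffs, rhs):
    """Solve a system of three equations by Cramer's rule (basic case only)."""
    d = det3(coeffs)
    if d == 0:
        raise ValueError("singular case: the determinant of the system is zero")
    solution = []
    for j in range(3):
        # Replace column j of the coefficients by the column of right-hand members.
        replaced = [row[:j] + [rhs[i]] + row[j + 1:] for i, row in enumerate(coeffs)]
        solution.append(Fraction(det3(replaced), d))
    return solution

coeffs = [[1, 0, -2], [2, 1, -1], [1, -2, 1]]
rhs = [1, 0, -2]
print(cramer3(coeffs, rhs))
```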
5. Numerical Solution. The application of the formulas given
in Sec. 4 is inconvenient in case the coefficients of equations are
approximate numbers and, in particular, if the number of equations
is large.

There are many methods that can be used in such cases. Here we
shall present two of them. We again take system (5) as an example
but the methods are applicable to systems with an arbitrary number
of unknowns.

Gauss’ method (the method of elimination), named after K. F. Gauss
(1777-1855), a famous German mathematician who also obtained
fundamental results in astronomy and geodesy, is analogous to
the method of evaluating determinants in the end of Sec. 3.

We divide the first equation of system (5) by $a_1$ which results in

$$x + b_1' y + c_1' z = d_1' \tag{15}$$

(the primes designate the new coefficients here). Multiply equation
(15) by $-a_2$ ($-a_3$) and add it to the second (third) equation of the
system. This eliminates x and yields the system

$$\left. \begin{aligned} b_2' y + c_2' z &= d_2' \\ b_3' y + c_3' z &= d_3' \end{aligned} \right\} \tag{16}$$

Now we divide the first of equations (16) by $b_2'$ and obtain

$$y + c_2'' z = d_2'' \tag{17}$$

Multiplying the last equation by $-b_3'$ and adding it to the second
of equations (16) we derive the equation of the form

$$c_3'' z = d_3'' \tag{18}$$

that is y is also eliminated. From (18) we find z; substituting it
into (17) we determine y; then substituting y and z into (15) we
finally evaluate x.

As in Sec. 3, if at some stage of the calculations the left uppermost
coefficient turns out to be considerably smaller than others in its
absolute value the method should be modified, that is we should
eliminate the unknown whose coefficient is the greatest in the absolute
value by dividing the corresponding equation by the coefficient.
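
The scheme of Gauss' method may be sketched as follows. We assume here a common variant of the modification just mentioned, in which the equations (rows) are interchanged so that the coefficient of the unknown being eliminated is the greatest in its column; the function name `gauss_solve` is hypothetical and the sample system is the one solved by Cramer's rule in Sec. 4.

```python
def gauss_solve(coeffs, rhs):
    """Solve a linear system by elimination with row interchanges and back substitution."""
    n = len(rhs)
    # Augmented matrix: the coefficients followed by the right-hand member.
    a = [list(map(float, row)) + [float(d)] for row, d in zip(coeffs, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("singular case: no unique solution")
        a[col], a[pivot] = a[pivot], a[col]
        # Divide the equation by its leading coefficient.
        lead = a[col][col]
        a[col] = [v / lead for v in a[col]]
        # Eliminate the unknown from the equations below.
        for r in range(col + 1, n):
            factor = a[r][col]
            a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    # Back substitution: the last unknown is found first.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
    return x

solution = gauss_solve([[1, 0, -2], [2, 1, -1], [1, -2, 1]], [1, 0, -2])
print(solution)
```

The result should coincide, up to rounding, with the values x = -5/9, y = 1/3, z = -7/9 given by the determinant formulas.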

The iterative method (compare with Sec. V.3) is applied to system
(5) in the following way. The system is rewritten in the form

$$\left. \begin{aligned} x &= \alpha_1 x + \beta_1 y + \gamma_1 z + \delta_1 \\ y &= \alpha_2 x + \beta_2 y + \gamma_2 z + \delta_2 \\ z &= \alpha_3 x + \beta_3 y + \gamma_3 z + \delta_3 \end{aligned} \right\} \tag{19}$$

Then we choose certain values $x = x_0$, $y = y_0$ and $z = z_0$ as the
zeroth approximation. The values are substituted for x, y and z,
respectively, into the right-hand sides of system (19) and thus the
first approximation $x_1$, $y_1$ and $z_1$ is obtained. Substituting again
$x_1$, $y_1$ and $z_1$ into the right-hand sides of (19) we get the second
approximation and so forth. Generally, the (n + 1)th approximation is
expressed in terms of the nth approximation by the formulas

$$\left. \begin{aligned} x_{n+1} &= \alpha_1 x_n + \beta_1 y_n + \gamma_1 z_n + \delta_1 \\ y_{n+1} &= \alpha_2 x_n + \beta_2 y_n + \gamma_2 z_n + \delta_2 \\ z_{n+1} &= \alpha_3 x_n + \beta_3 y_n + \gamma_3 z_n + \delta_3 \end{aligned} \right\} \tag{20}$$

If the process converges, that is the successive approximations
have limits as $n \to \infty$ ($x_n \to \bar{x}$, $y_n \to \bar{y}$ and $z_n \to \bar{z}$), then passing
to the limit in formulas (20) as $n \to \infty$ we see that $x = \bar{x}$, $y = \bar{y}$
and $z = \bar{z}$ form the solution of system (19).

Just as in the examples given in Sec. V.3, the smaller the absolute
values of the coefficients in the unknowns in the right-hand sides
of the system, the faster the convergence of the iterative method for
system (19). Some more precise tests for the convergence will be
given in Sec. XVII.18.
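
The iterative process is very simple to program. In the sketch below the system, its small coefficients, the zero initial approximation and the number of steps are all hypothetical choices of ours made for illustration; the function name `iterate` is likewise an assumption.

```python
def iterate(coeffs, free_terms, start, steps):
    """Repeat the substitution of the current approximation into the right-hand sides."""
    v = list(start)
    for _ in range(steps):
        v = [sum(m * t for m, t in zip(row, v)) + d
             for row, d in zip(coeffs, free_terms)]
    return v

# A hypothetical system already solved for the unknowns, with small coefficients:
#   x = 0.1y + 0.2z + 1,  y = 0.2x - 0.1z + 2,  z = 0.1x + 0.1y + 3
coeffs = [[0.0, 0.1, 0.2],
          [0.2, 0.0, -0.1],
          [0.1, 0.1, 0.0]]
free_terms = [1.0, 2.0, 3.0]
x, y, z = iterate(coeffs, free_terms, start=[0.0, 0.0, 0.0], steps=50)
print(x, y, z)
```

Here the sum of the absolute values of the coefficients in each equation does not exceed 0.3, so that the error decreases at least about three times at every step and fifty steps are far more than is needed.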

System (5) is solvable in the simplest way in case it has the tri- 
angular form: 


$$\left. \begin{aligned} a_1 x &= d_1 \\ a_2 x + b_2 y &= d_2 \\ a_3 x + b_3 y + c_3 z &= d_3 \end{aligned} \right\}$$

In this case we immediately find x from the first equation of the 
system; substituting it into the second equation we obtain y and 
then determine z after substituting x and y into the third equation. 
Gauss’ method described above is essentially the method of reducing 
general system (5) to the triangular form (15), (17) and (18). 

Different useful rules for numerical solution of systems of linear 
algebraic equations can be found in [3], [10] and [13]. 

6. Singular Case. If the determinant of a system is equal to zero
we shall say that there is a singular case. As will be shown in
Sec. X.7, in such a case there may be one of the following two
possibilities.

1. The system is inconsistent (that is contradictory) which means 
that it has no solution. For example, the system 

$$
\begin{aligned}
x + 2y - z &= 1 \\
2x - y &= 3 \\
3x + y - z &= 5
\end{aligned}
$$

is just of this type since adding together the first two equations we 
arrive at a contradiction with the third one. 
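
That the example really belongs to the singular case can be checked directly. The sketch below uses only the coefficients of the unknowns; the helper name `det3` is hypothetical.

```python
def det3(m):
    """Third-order determinant expanded along the first row."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            - b1 * (a2 * c3 - a3 * c2)
            + c1 * (a2 * b3 - a3 * b2))


# Coefficients of x, y and z in the three equations of the example.
coefficients = [[1, 2, -1], [2, -1, 0], [3, 1, -1]]

# Adding the left-hand sides of the first two equations term by term.
row_sum = [p + q for p, q in zip(coefficients[0], coefficients[1])]

print(det3(coefficients), row_sum)
```

The determinant vanishes and the sum of the first two left-hand sides coincides with the third one, so the system is consistent only if the right-hand sides add up in the same way.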





CHAPTER VII 


Vectors 


§ 1. Linear Operations on Vectors 

1. Scalar and Vector Quantities. Scalar and vector quantities 
differ in the following aspect: whilst the former are completely 
characterized by their numerical values relative to a chosen system 
of measurement units (such quantities as temperature, work, den- 
sity etc.), the latter have, in addition, certain direction in space 
(such quantities as force, velocity etc.). All the quantities we studied 
before were scalars and it was therefore permissible not to use the 
word “scalar”. But when we consider both scalar and vector quan- 
tities it is essentially important to take into account the nature 
of the quantities. A vector quantity, or, simply, a vector, can be 
represented by a line segment in space if we choose a certain unit 
of length. For instance, if we intend to represent forces we can assume 
that the segment of 2 cm represents the force of 1 kg and the like 
(see Sec. I.4 on this question). The line segment representing a vector 
is directed (oriented), that is its origin and its terminus must be 
indicated. The direction of a vector is usually indicated by an 
arrow. Disregarding the direction of a vector we obtain the absolute 
value (modulus) of this quantity. Thus, the absolute value of a vector 
is a scalar which has the dimension of the quantity in question 
and is always positive with the only exception for the zero vector 
(see Sec. 3). The absolute value of a vector is also called its length. 
For example, let us take a vector representing a force. When we 
speak about the vector we mean that its length has the dimension 
of force, that is the measurement of the length of the representing 
line segment in the chosen scale units yields the magnitude of the 
force. In mathematics vectors are usually regarded as dimensionless: 
The absolute value of such a vector is a dimensionless (abstract) 
number. Vectors are usually given in bold face or designated by 
arrows (see Fig. 139). The absolute value of a vector is denoted by 
the same letter but in ordinary print or in bold face with vertical 
bars (the sign of modulus): $|\overrightarrow{AB}| = a = |\mathbf a|$. 
Thus, to define a vector means to define its absolute value and 
its direction in space. Accordingly, every vector can be translated 
(that is it can be transferred in such a way that its direction in 
space remains parallel to the original direction) to any place. 
Hence, the origin of a vector (its “point of application”) can be 
anywhere. In other words, two vectors are regarded as being equal if they 
have the same absolute values, are parallel and similarly directed. 

The freedom of translating a vector is sometimes restricted. For 
instance, the origin of a certain vector can be fixed. Such vectors 
are called localized or bound vectors (a radius-vector mentioned in 
Sec. 9 presents an example of such a vector). Further, there can 
exist a certain straight line in which a vector must lie. Then the 
vector is said to be a sliding vector. The vector of angular velocity 
of a rotary motion which lies on the axis of revolution is an example 
of such a vector. If the parallel translation of a vector is unrestricted 
the vector is called a free vector. 

Fig. 139 

Fig. 140. $\mathbf a = \mathbf b$ 

2. Addition of Vectors. Linear operations on vectors are the ope- 
rations of adding vectors together and of multiplying a vector by 
a scalar (and the operation of subtraction which is, of course, con- 
nected with the addition). Generally, a quantity can be considered 
to be a vector if and only if the above operations are performed in 
accordance with the rules described further in Secs. 2-4. 

The addition of two vectors is performed according to the well-known 
parallelogram law of mechanics, which is the rule of adding 
forces and velocities. For example, if it is required to add together 
two vectors a and b they are applied to a common origin and then 
a parallelogram is constructed on them (see Fig. 141). A vector 
coinciding with the diagonal of the parallelogram whose origin 
is that of a and b is, by definition, the sum a + b. This construction 
immediately implies that a + b = b + a, that is the commutative 
law holds for the addition of vectors. 

Fig. 141. $\mathbf c = \mathbf a + \mathbf b$ 

The opposite sides of a parallelogram being parallel and equal, 
the vector DC in Fig. 141 is also equal to b. This implies one more 
rule of adding vectors: the origin of the second vector is placed 
at the terminus of the first vector. Then a vector closing the triangle, 
that is the vector whose origin is that of the first vector and whose 
terminus is that of the second vector, is the sum of the vectors 
(Fig. 142). If now it is necessary to add a third vector to the above 
sum we must put the origin of the third vector at the terminus of 
the second one and take the closing vector again etc. The general 
rule of adding together any number of vectors is illustrated in 
Fig. 143. It follows from Fig. 144 that the associative law also holds 
for a sum of vectors: a + (b + c) = (a + b) + c. The commutative 
and associative laws imply that the order of summation and the 
way of bracketing do not affect a sum of any number of summands. 
For example, 

$$(\mathbf a + \mathbf b) + (\mathbf c + \mathbf d) = [(\mathbf b + \mathbf d) + \mathbf c] + \mathbf a = [\mathbf c + (\mathbf a + \mathbf d)] + \mathbf b$$

and the like. 
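
The independence of a sum of the order of summation and of the way of bracketing is easy to check numerically. The sketch below assumes that each vector is given by three numerical components relative to some fixed basis (components are introduced only later in this chapter, in Secs. 5 and 9); the helper name `add` and the numerical values are hypothetical.

```python
def add(*vectors):
    """Componentwise sum of any number of vectors given as tuples."""
    return tuple(sum(components) for components in zip(*vectors))


a, b, c, d = (1, 2, 0), (-3, 1, 4), (0, 5, -2), (2, 2, 2)

first = add(add(a, b), add(c, d))      # (a + b) + (c + d)
second = add(add(add(b, d), c), a)     # [(b + d) + c] + a
third = add(add(c, add(a, d)), b)      # [c + (a + d)] + b
print(first, second, third)
```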

We must emphasize that we cannot add together vectors of different 
dimensions and that it is impossible to add together a vector 
and a scalar. Besides, we shall not consider the comparison of 
vectors in our course. This means that there will be no positive and 
negative vectors or inequalities of the form a > b etc. But of course 
we can compare absolute values (lengths) of vectors. At the same 
time we must not be surprised that the absolute value of a sum of 
vectors may happen to be, for example, less than the absolute value 
of each summand. Indeed, vectors are added not as numbers 
but according to the parallelogram law of forces, and the resultant 
of a system of forces can be smaller than each of the forces. 

We conclude by pointing out a consequence of Fig. 143, namely, 

$$|\mathbf a + \mathbf b + \mathbf c + \mathbf d| \le |\mathbf a| + |\mathbf b| + |\mathbf c| + |\mathbf d|$$

Equality holds here only if all the vectorial summands are of 
the same direction; in such a case the polygon of vectors degenerates 
into a straight line. 
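
Both remarks (the sum can be shorter than each summand, and equality in the last inequality requires a common direction) can be illustrated numerically. The sketch assumes vectors given by components relative to mutually perpendicular unit vectors, so that lengths are computed by Pythagoras' theorem (see Sec. 10); the helper names and the numbers are hypothetical.

```python
import math


def add(*vectors):
    """Componentwise sum of vectors given as tuples."""
    return tuple(sum(components) for components in zip(*vectors))


def length(v):
    """Absolute value of a vector given by Cartesian components."""
    return math.sqrt(sum(c * c for c in v))


# Two almost opposite vectors: the sum is shorter than each summand.
a, b = (3.0, 0.0, 0.0), (-3.0, 1.0, 0.0)
total = add(a, b)
print(length(total), length(a), length(b))

# Two vectors of the same direction: the inequality becomes an equality.
c, d = (1.0, 2.0, 2.0), (2.0, 4.0, 4.0)
print(length(add(c, d)), length(c) + length(d))
```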

3. Zero Vector and Subtraction of Vectors. A vector whose terminus 
coincides with its origin is called the zero vector or, simply, 
zero. Its absolute value is equal to zero whereas all the other 
vectors have positive absolute values. The direction of this vector is 
undetermined, i.e. any direction may be ascribed to it. We can 
therefore regard the zero vector as parallel (perpendicular) to any 
vector. It is denoted as 0 and its role in an operation of adding 
vectors is similar to that of the number zero in adding numbers. 
In fact, it is apparent that a + 0 = a. 

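Let a vector $\mathbf a = \overrightarrow{AB}$ be given. Then the vector 
$\overrightarrow{BA}$ is called the negative of the vector a and is 
denoted as $-\mathbf a$ (see Fig. 145). Obviously, 
$\mathbf a + (-\mathbf a) = \mathbf 0$. 

To subtract a vector means to add its negative. It follows that 

$$\mathbf b + (\mathbf a - \mathbf b) = \mathbf b + [\mathbf a + (-\mathbf b)] = \mathbf a + [\mathbf b + (-\mathbf b)] = \mathbf a + \mathbf 0 = \mathbf a$$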

This corresponds to the usual definition of a difference. The geo- 
metrical interpretation of the rule of subtraction is shown in 
Fig. 146. 

4. Multiplying a Vector by a Scalar. The product 
$\lambda\mathbf a = \mathbf a\lambda$ of a vector a by a dimensionless 
scalar (number) $\lambda$ is defined as follows: if $\lambda > 0$ then 
the product is a vector which is obtained from a as it is stretched 
$\lambda$-fold without changing its direction; if $\lambda < 0$ then 
a must be stretched $|\lambda|$-fold and its direction must be replaced 
by the opposite direction (see Fig. 147). Further, 
$\dfrac{\mathbf a}{\lambda} = \dfrac{1}{\lambda}\,\mathbf a$. These 
definitions imply the following simple properties: 

1. $(-1)\,\mathbf a = -\mathbf a$; 

2. $0\,\mathbf a = \mathbf 0$; 

3. $\lambda\,\mathbf 0 = \mathbf 0$; 

4. $(\lambda + \mu)\,\mathbf a = \lambda\mathbf a + \mu\mathbf a$; 

5. $\lambda\,(\mathbf a + \mathbf b) = \lambda\mathbf a + \lambda\mathbf b$; 

6. $\lambda\,(\mu\mathbf a) = (\lambda\mu)\,\mathbf a$; 

7. $\lambda\,\dfrac{\mathbf a}{\lambda} = \mathbf a$; 

8. If n is a positive integer then 
$n\mathbf a = \underbrace{\mathbf a + \mathbf a + \ldots + \mathbf a}_{n \text{ times}}$. 

These properties enable us to perform linear operations on vectors 
and transform algebraic expressions containing vectors in the same 
way as it is done with numbers. The properties are proved in an 
obvious way. For instance, Fig. 148 demonstrates the proof of 
property 5 [$\lambda\mathbf a + \lambda\mathbf b = \lambda\mathbf c = \lambda(\mathbf a + \mathbf b)$] 
for $\lambda > 0$. 

If a scalar $\lambda$ has a dimension the product $\lambda\mathbf a$ is 
defined as a vector whose absolute value is equal to 
$|\lambda|\,|\mathbf a|$ and which is parallel to a and directed like a 
if $\lambda > 0$ and directed contrary to a if $\lambda < 0$. 



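Fig. 148. $\mathbf c = \mathbf a + \mathbf b$, $\lambda\mathbf c = \lambda\mathbf a + \lambda\mathbf b$ 

Fig. 149. $\mathbf c = \lambda\mathbf a + \mu\mathbf b$ 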


All the enumerated properties remain true for this case as well. 
Further, for the sake of simplicity, we shall regard all vectors and 
scalars as dimensionless unless otherwise stated. 
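
Properties 4, 5 and 6 of the preceding list can be verified on a numerical example. The sketch assumes vectors given by components relative to a fixed basis; exact rational arithmetic from the standard `fractions` module is used so that the equalities hold exactly. The helper names and the numbers are hypothetical.

```python
from fractions import Fraction


def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(p + q for p, q in zip(u, v))


def scale(k, v):
    """Product of the scalar k and the vector v."""
    return tuple(k * c for c in v)


lam, mu = Fraction(3, 2), Fraction(-2)
a, b = (1, -2, 4), (0, 5, -3)

checks = {
    "property 4": scale(lam + mu, a) == add(scale(lam, a), scale(mu, a)),
    "property 5": scale(lam, add(a, b)) == add(scale(lam, a), scale(lam, b)),
    "property 6": scale(lam, scale(mu, a)) == scale(lam * mu, a),
}
print(checks)
```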

5. Linear Combination of Vectors. Suppose we have several vec- 
tors, for example, three vectors a, b and c. Then every vector of the 
form $\mathbf d = \lambda\mathbf a + \mu\mathbf b + \nu\mathbf c$ where $\lambda$, $\mu$ and $\nu$ are scalars is called 
a linear combination of the vectors a, b and c. The vector d is also 
said to be expressed linearly in terms of a, b and c which means that 
d can be obtained by linear operations on a, b and c. As examples 
of such linear combinations we can take 

$$\mathbf a + 2\mathbf b - 3\mathbf c, \quad \mathbf a + 0\mathbf b + \tfrac{1}{2}\mathbf c, \quad \mathbf 0 = 0\cdot\mathbf a + 0\cdot\mathbf b + 0\cdot\mathbf c \quad \text{etc.}$$

A given system of vectors is called linearly dependent if one of the 
vectors is expressed linearly in terms of the rest. Otherwise the 
vectors are called linearly independent. 

Two vectors are linearly dependent if and only if they are parallel 
to each other. Indeed, it follows from the definition (see Sec. 4) 
that $\mathbf b = \lambda\mathbf a$ implies $\mathbf b \parallel \mathbf a$. 
Conversely, if two vectors are parallel 
then they are linearly dependent because we can always take such 
a coefficient of extension $\lambda$ that after increasing the length of one 
of the vectors $\lambda$ times the extended vector should coincide with the 
other vector (in case the parallel vectors are of opposite directions 
the coefficient $\lambda$ is negative). 

Three vectors are linearly dependent if and only if they are parallel 
to a common plane. Indeed, let $\mathbf c = \lambda\mathbf a + \mu\mathbf b$. 
Let us translate the three vectors so that their origins coincide. 
Draw a plane through the vectors a and b (the plane P in Fig. 149). 
Then the vectors $\lambda\mathbf a$ and $\mu\mathbf b$ will lie in the 
plane P and therefore their sum, that is c, 
will also lie in the same plane. Consequently, the vectors a, b and c 
were parallel to the plane P in their original positions. Conversely, 
let it be known that the vectors a, b and c are parallel to a common 
plane P. Fig. 149 shows the representation of c in the form of a linear 
combination of a and b in case $\mathbf a \nparallel \mathbf b$. Such a 
representation is called the resolution of a vector in a plane into 
components with respect to two given non-parallel vectors. As for the 
case $\mathbf a \parallel \mathbf b$, the preceding paragraph implies 
that one of the two vectors a and b (for instance, the 
vector a) is expressed linearly in terms of the other vector (for 
instance, in terms of b). Therefore a, b and c are linearly dependent 
because a is expressed in terms of b and c (the coefficient of c being zero). 

Four or more vectors are always linearly dependent. Indeed, let 
us take four vectors a, b, c and d. Translate them to a common origin A. 
If after this the vectors a, b and c lie in a common plane then, 
in accord with the preceding paragraph, one of them is expressed 
linearly in terms of the rest etc. (as in the end of the preceding 
paragraph). Let now a, b and c not lie in a common plane after they 
are translated to a common origin (see Fig. 150). Then we can draw 
a straight line parallel to the vector c and passing through the 
point D (the terminus of the vector d), the point C being the point 
of intersection of the straight line with the plane in which the 
vectors a and b lie. Finally, we draw a straight line parallel to the 
vector b and passing through the point C. This straight line intersects 
the straight line containing the vector a at a point B. Then we can 
write 

$$\mathbf d = \overrightarrow{AD} = \overrightarrow{AB} + \overrightarrow{BC} + \overrightarrow{CD} = \lambda\mathbf a + \mu\mathbf b + \nu\mathbf c \qquad (1)$$

Such a representation is called the resolution of a vector into 
components with respect to three given vectors which are not parallel to 
one and the same plane. We also call it the resolution of a vector into 
components along three given axes (these axes are denoted as ll, mm 
and nn in Fig. 150). Representations of this type are utilized in 
theoretical mechanics and other branches of science for resolving 
forces and other vector quantities into components along three 
given directions. Each of the summands $\lambda\mathbf a$, $\mu\mathbf b$ 
and $\nu\mathbf c$ is called a component of the vector d along the 
corresponding direction. A component along an axis depends not only on 
the direction of the axis but also on the directions of other axes. 
At the same time a component does not depend on the choice of positive 
directions for given axes. 

Fig. 150. $\mathbf d = \lambda\mathbf a + \mu\mathbf b + \nu\mathbf c$ 

Fig. 151. $\mathbf a = \mathbf a_p + \mathbf a_l$ 





Another type of representing a vector which is sometimes used 
is shown in Fig. 151. Here a given vector is resolved into components 
$\mathbf a_p$ and $\mathbf a_l$ where $\mathbf a_p$ lies in a given plane 
and $\mathbf a_l$ is parallel to a given axis. 

Resolution (1) can be performed uniquely. Indeed, if another 
representation $\mathbf d = \lambda_1\mathbf a + \mu_1\mathbf b + \nu_1\mathbf c$ 
existed we could equate the right-hand sides and deduce the relation 
$(\lambda - \lambda_1)\,\mathbf a + (\mu - \mu_1)\,\mathbf b + (\nu - \nu_1)\,\mathbf c = \mathbf 0$. 
This would imply that the vectors a, b and c 
are linearly dependent (why is it so?). 

A system of linearly independent vectors used for resolving other 
vectors into components with respect to the system is called a basis. 
The foregoing discussion implies that any two non-parallel vectors 
can be taken as a basis in their plane and that any three vectors 
non-parallel to one and the same plane can be taken as a basis in 
space. If a, b and c form such a basis then the numbers $\lambda$, $\mu$ 
and $\nu$ entering into representation (1) are called the coordinates of 
the vector d with respect to the basis a, b, c. If a basis a, b, c is 
given the coordinates $\lambda$, $\mu$ and $\nu$ are uniquely determined by 
the vector d. Conversely, the vector d is uniquely determined by its 
coordinates $\lambda$, $\mu$ and $\nu$. 
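
The coordinates of a vector with respect to a given basis can be computed by solving representation (1) as a system of three linear equations. The sketch below is an illustration under stated assumptions: the vectors a, b, c and d are given by integer components relative to some auxiliary reference basis, and the system is solved by the rule of determinants; the function names `det3` and `coordinates` and the numbers are hypothetical.

```python
from fractions import Fraction


def det3(u, v, w):
    """Third-order determinant whose rows are the vectors u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def coordinates(d, a, b, c):
    """Numbers (lam, mu, nu) such that d = lam*a + mu*b + nu*c."""
    main = det3(a, b, c)
    if main == 0:
        # a, b and c are parallel to a common plane and form no basis.
        raise ValueError("the vectors a, b, c are linearly dependent")
    return (Fraction(det3(d, b, c), main),
            Fraction(det3(a, d, c), main),
            Fraction(det3(a, b, d), main))


a, b, c = (1, 0, 0), (1, 1, 0), (1, 1, 1)
d = (6, 5, 3)
lam, mu, nu = coordinates(d, a, b, c)
print(lam, mu, nu)
```

A zero main determinant corresponds to three vectors parallel to a common plane, for which the resolution is impossible or not unique.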

§ 2. Scalar Product of Vectors 

6. Projection of Vector on Axis. Let a vector 
$\mathbf a = \overrightarrow{AB}$ and an axis l be given (see Fig. 152). 
The projection $\operatorname{proj}_l \mathbf a$ of the vector a 
on the axis l is the length of the segment A'B' connecting the feet 
of the perpendiculars drawn from the points A and B to the axis l. 
This length is taken with the sign + or - depending on whether 
the direction of the segment A'B' coincides with the positive 
direction of the axis or is opposite to it. A projection of a vector on 
another vector is defined similarly. In this case the perpendiculars 
are drawn to the other vector or to its prolongation. Thus, a 
projection of a vector is a scalar; the dimension of a projection 
coincides with that of the projected vector. 

formula (3) cannot be extended to more than two vectors. Bearing 
in mind formula (2) we can also put down the relation 

$$\mathbf a\cdot\mathbf b = b\,\operatorname{proj}_{\mathbf b}\mathbf a = a\,\operatorname{proj}_{\mathbf a}\mathbf b \qquad (4)$$

Therefore, the scalar product of two vectors is equal to the product 
of the absolute value of one of the vectors by the projection of the other 
vector on the first. 

Example. Let s be a displacement vector of a material point and 
let F be a constant force (one of the forces acting upon the point 
in a process of motion) depicted in Fig. 155. To reckon the work A 
performed by this force we must take into account only the 
component F' of the force F along the direction of the displacement. 
Hence, $A = s\,\operatorname{proj}_{\mathbf s}\mathbf F = \mathbf s\cdot\mathbf F$. 
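
A numerical illustration of the example may be useful. The sketch assumes that the force and the displacement are given by components relative to mutually perpendicular unit vectors, in which case the scalar product is the sum of the products of the corresponding components (this coordinate formula is derived in Sec. 10); the numerical values and units are hypothetical.

```python
import math


def dot(u, v):
    """Scalar product of vectors given by Cartesian components."""
    return sum(p * q for p, q in zip(u, v))


def length(v):
    """Absolute value of a vector."""
    return math.sqrt(dot(v, v))


force = (3.0, 4.0, 0.0)          # newtons (hypothetical)
displacement = (5.0, 0.0, 0.0)   # metres (hypothetical)

work = dot(displacement, force)               # A = s . F
projection = work / length(displacement)      # proj_s F
print(work, projection)
```

Only the component of the force along the displacement (3 N) contributes to the work; the component of 4 N perpendicular to the displacement does not.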

The dimension of a scalar product is equal to the product of di- 
mensions of the factors. 




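Formulas (3) and (4) are simplified if one of the factors (or both 
factors) is a unit vector, that is a vector with the dimensionless 
absolute value 1. For example, if $\mathbf e_1$, $\mathbf e_2$ and 
$\mathbf e$ are unit vectors, 

$$\mathbf e_1\cdot\mathbf e_2 = \cos(\widehat{\mathbf e_1, \mathbf e_2}), \qquad \mathbf a\cdot\mathbf e = \operatorname{proj}_{\mathbf e}\mathbf a \qquad (5)$$

A unit vector lying on an axis l is usually denoted as $\mathbf l^{\circ}$. 
Similarly, a unit vector parallel to a vector b and having the same 
direction is denoted by $\mathbf b^{\circ}$. Taking into account formula (5) 
we can write 

$$\operatorname{proj}_{l}\mathbf a = \operatorname{proj}_{\mathbf l^{\circ}}\mathbf a = \mathbf a\cdot\mathbf l^{\circ}, \qquad \operatorname{proj}_{\mathbf b}\mathbf a = \mathbf a\cdot\mathbf b^{\circ}$$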

8. Properties of Scalar Product. 

1. A scalar product is equal to zero if and only if the vectors are 
orthogonal, that is perpendicular to each other: 

$$\mathbf a\cdot\mathbf b = 0 \ \text{ is equivalent to } \ \mathbf a \perp \mathbf b$$

Indeed, this follows from (3) since $\cos(\widehat{\mathbf a, \mathbf b}) = 0$ 
implies $(\widehat{\mathbf a, \mathbf b}) = 90^{\circ}$. Of course, it may 
happen that the absolute value $a = 0$, but this means 
that $\mathbf a = \mathbf 0$, and a zero vector can be regarded as being 
perpendicular to any vector (see Sec. 3). 

2. $\mathbf a\cdot\mathbf a = a^2$ because $(\widehat{\mathbf a, \mathbf a}) = 0^{\circ}$ 
and $\cos(\widehat{\mathbf a, \mathbf a}) = 1$. In other 
words, the scalar square of a vector equals the square of its 
absolute value. 

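3. A scalar product does not depend on the order of its factors: 
$\mathbf a\cdot\mathbf b = \mathbf b\cdot\mathbf a$. [This immediately follows from formula (3).] 

4. A scalar factor may be taken outside the scalar product of 
vectors: 

$$(\lambda\mathbf a)\cdot\mathbf b = \mathbf a\cdot(\lambda\mathbf b) = \lambda\,(\mathbf a\cdot\mathbf b) \qquad (6)$$

Indeed, on the basis of property 4 from Sec. 6, we have 
$\operatorname{proj}_{\mathbf b}(\lambda\mathbf a) = \lambda\operatorname{proj}_{\mathbf b}\mathbf a$. 
Multiplying both sides by b and taking into account 
formula (4) we derive $\mathbf b\cdot(\lambda\mathbf a) = \lambda\,(\mathbf a\cdot\mathbf b)$. 
The scalar product being commutative, the last relation implies formula (6). 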

[We suggest that the reader should deduce formula (6) as a direct 
consequence of the definition of a scalar product.] 

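5. Distributive law: 

$$(\mathbf a + \mathbf b)\cdot\mathbf c = \mathbf a\cdot\mathbf c + \mathbf b\cdot\mathbf c$$

To prove this property we write 
$\operatorname{proj}_{\mathbf c}(\mathbf a + \mathbf b) = \operatorname{proj}_{\mathbf c}\mathbf a + \operatorname{proj}_{\mathbf c}\mathbf b$ 
on the basis of property 5 in Sec. 6. Then we multiply both sides 
of the last relation by c which yields the desired formula. 

The enumerated properties enable us to compute the scalar product 
of linear combinations of vectors. For instance, 

$$(\mathbf a + 2\mathbf b)\cdot(2\mathbf a - 3\mathbf b) = 2\,\mathbf a\cdot\mathbf a - 3\,\mathbf a\cdot\mathbf b + 4\,\mathbf b\cdot\mathbf a - 6\,\mathbf b\cdot\mathbf b = 2a^2 + \mathbf a\cdot\mathbf b - 6b^2$$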
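
The last expansion can be checked on particular vectors. The sketch assumes Cartesian components, for which the scalar product is the sum of the products of the corresponding components (see Sec. 10); the helper names and the numerical values are hypothetical.

```python
def dot(u, v):
    """Scalar product of vectors given by Cartesian components."""
    return sum(p * q for p, q in zip(u, v))


def combine(k, u, m, v):
    """Linear combination k*u + m*v."""
    return tuple(k * p + m * q for p, q in zip(u, v))


a, b = (1, 2, 3), (-1, 0, 2)

left = dot(combine(1, a, 2, b), combine(2, a, -3, b))
right = 2 * dot(a, a) + dot(a, b) - 6 * dot(b, b)
print(left, right)
```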


§ 3. Cartesian Coordinates in Space 

9. Cartesian Coordinates in Space. Let us take a triad of vectors 
a, b and c drawn from a common origin at a point O and not lying 
in one and the same plane. We choose the point O as the origin 
of coordinates. The origin of a coordinate system given, the 
position of an arbitrary (variable) point M in space is completely 
specified by the vector $\mathbf r = \overrightarrow{OM}$. This vector is 
called the radius-vector of the point M (see Fig. 156). As proved in 
Sec. 5, we can take the vectors a, b and c as a basis and represent the 
radius-vector in the form 
$\mathbf r = \lambda\mathbf a + \mu\mathbf b + \nu\mathbf c$. Thus, the 
position of the point M is characterized by the triple of numbers 
$\lambda$, $\mu$ and $\nu$ which are called the 
affine coordinates of the point M. In accordance with the end of 
Sec. 5 we can say that the affine coordinates of a point are the 
coordinates of its radius-vector. Therefore, every point in space, just 
as a point in a plane, has certain coordinates and, conversely, the 
coordinates being given, we can always construct the corresponding 
point (but in space a point has three coordinates). 

If vectors chosen as a basis of a coordinate system have unit 
lengths and are mutually perpendicular the coordinate system is 
called a Cartesian system. In such a case vectors forming a Cartesian 
basis are usually designated by the letters i, j and k. Cartesian 
coordinates are usually denoted as x, y and z. Thus, according to 
Fig. 157, we have 

$$\mathbf r = x\mathbf i + y\mathbf j + z\mathbf k \qquad (7)$$

In Fig. 157 we see the point M with the coordinates x = 2, 
y = 2.4 and z = 1.6. The planes xOy, yOz and xOz (the coordinate 
planes) divide the space into eight parts (octants). The signs of the 
coordinates of a point show in which octant the point lies. 

Similarly, any vector a can be represented (with respect to a 
coordinate system) in the form analogous to (7): 

$$\mathbf a = a_x\mathbf i + a_y\mathbf j + a_z\mathbf k \qquad (8)$$

where $a_x$, $a_y$ and $a_z$ are the projections of the vector a on the 
corresponding coordinate axes. Taking into account that for any unit 
vector e we have, by formula (2), 
$\operatorname{proj}_l \mathbf e = \cos(\widehat{\mathbf e, l})$, we deduce 
from formula (8) the relation 

$$\mathbf e = \cos(\widehat{\mathbf e, x})\,\mathbf i + \cos(\widehat{\mathbf e, y})\,\mathbf j + \cos(\widehat{\mathbf e, z})\,\mathbf k$$

10. Some Simple Problems Concerning Cartesian Coordinates. 

1. Operations on vectors represented in terms of their projections 
on the axes of a Cartesian coordinate system are performed according 
to the following simple rules. If [see formula (8)] we have 

$$\mathbf a = a_x\mathbf i + a_y\mathbf j + a_z\mathbf k \quad \text{and} \quad \mathbf b = b_x\mathbf i + b_y\mathbf j + b_z\mathbf k$$

then (see Secs. 2, 4 and 8) 

$$\mathbf a + \mathbf b = (a_x + b_x)\,\mathbf i + (a_y + b_y)\,\mathbf j + (a_z + b_z)\,\mathbf k \qquad (10)$$

$$\lambda\mathbf a = \lambda a_x\mathbf i + \lambda a_y\mathbf j + \lambda a_z\mathbf k \qquad (11)$$

$$\mathbf a\cdot\mathbf b = a_x b_x + a_y b_y + a_z b_z \qquad (12)$$

$$a^2 = \mathbf a\cdot\mathbf a = a_x^2 + a_y^2 + a_z^2 \qquad (13)$$

To deduce the last two formulas which are of great importance we 
should notice that 
$\mathbf i\cdot\mathbf i = \mathbf j\cdot\mathbf j = \mathbf k\cdot\mathbf k = 1$ 
and $\mathbf i\cdot\mathbf j = \mathbf j\cdot\mathbf k = \mathbf k\cdot\mathbf i = 0$ 
since i, j and k are mutually perpendicular unit vectors. Formula 
(13) is nothing but the expression of Pythagoras’ theorem in space: 
the square of the length of the diagonal of a rectangular 
parallelepiped equals the sum of the squares of the lengths of its edges. 

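For example, let it be required to determine the angle between 
the vectors $\mathbf a = 3\mathbf i - 2\mathbf j + \mathbf k$ and 
$\mathbf b = -2\mathbf i + \mathbf j + 4\mathbf k$. We have, 
by formula (3), 

$$\cos(\widehat{\mathbf a, \mathbf b}) = \frac{\mathbf a\cdot\mathbf b}{ab} = \frac{3\cdot(-2) + (-2)\cdot 1 + 1\cdot 4}{\sqrt{3^2 + (-2)^2 + 1^2}\ \sqrt{(-2)^2 + 1^2 + 4^2}} = \frac{-4}{\sqrt{14}\,\sqrt{21}} = -0.233$$

from which we obtain, within the accuracy guaranteed by a slide 
rule, $(\widehat{\mathbf a, \mathbf b}) = 103.5^{\circ}$. 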
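
The same computation can be repeated with more digits than a slide rule provides. The sketch below applies formulas (12) and (13) to the two vectors of the example; the helper names `dot` and `length` are hypothetical.

```python
import math


def dot(u, v):
    """Scalar product by formula (12)."""
    return sum(p * q for p, q in zip(u, v))


def length(v):
    """Absolute value by formula (13)."""
    return math.sqrt(dot(v, v))


a = (3, -2, 1)
b = (-2, 1, 4)

cos_angle = dot(a, b) / (length(a) * length(b))
angle = math.degrees(math.acos(cos_angle))
print(round(cos_angle, 4), round(angle, 2))
```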

2. The parallelism and perpendicularity conditions for vectors 
given in the form of representation (9) can be derived as follows. 
The condition $\mathbf a \parallel \mathbf b$ is, by Sec. 5, equivalent to 
the relation of the form $\mathbf b = \lambda\mathbf a$ or, on the basis 
of formula (11), $b_x = \lambda a_x$, $b_y = \lambda a_y$, 
$b_z = \lambda a_z$. Now, eliminating $\lambda$, we get the desired condition 

$$\frac{b_x}{a_x} = \frac{b_y}{a_y} = \frac{b_z}{a_z} \quad (\text{which is the condition of } \mathbf a \parallel \mathbf b)$$

Further, according to Sec. 8 (property 1), the condition 
$\mathbf a \perp \mathbf b$ is equivalent to the equality 
$\mathbf a\cdot\mathbf b = 0$ and, by formula (12), we 
deduce the condition 

$$a_x b_x + a_y b_y + a_z b_z = 0 \quad (\text{which is the condition of } \mathbf a \perp \mathbf b)$$

3. The cosines of the angles which a vector forms with coordinate 
axes are called the direction cosines of the vector. If a vector a is 
given in the form of its representation (8) then we have, by formula 
(2), $a_x = \operatorname{proj}_x \mathbf a = a\cos(\widehat{\mathbf a, x})$ 
etc., that is 

$$\cos(\widehat{\mathbf a, x}) = \frac{a_x}{a}, \qquad \cos(\widehat{\mathbf a, y}) = \frac{a_y}{a}, \qquad \cos(\widehat{\mathbf a, z}) = \frac{a_z}{a}$$

This implies 

$$\cos^2(\widehat{\mathbf a, x}) + \cos^2(\widehat{\mathbf a, y}) + \cos^2(\widehat{\mathbf a, z}) = \frac{a_x^2 + a_y^2 + a_z^2}{a^2} = 1$$

The direction cosines of a vector completely determine its 
direction but give no information about its length. 

The direction cosines of an axis are defined in like manner: they 
are the direction cosines of an arbitrarily taken vector parallel 
to the axis and having the same direction. 
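
As an illustration, the direction cosines of a particular vector can be computed and the relation between their squares verified. The function name `direction_cosines` and the numerical vector are hypothetical.

```python
import math


def direction_cosines(a):
    """Cosines of the angles between the vector a and the axes x, y, z."""
    length = math.sqrt(sum(c * c for c in a))
    if length == 0:
        raise ValueError("the zero vector has no definite direction")
    return tuple(c / length for c in a)


cosines = direction_cosines((2.0, -3.0, 6.0))   # the length equals 7
sum_of_squares = sum(c * c for c in cosines)
print(cosines, sum_of_squares)

# A parallel vector of the same direction has the same direction cosines.
same_direction = direction_cosines((4.0, -6.0, 12.0))
```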



Fig. 158 

Fig. 159 


4. The vector connecting two given points $M_1(x_1, y_1, z_1)$ and 
$M_2(x_2, y_2, z_2)$ can be found as follows. According to Fig. 158 we 
have $\overrightarrow{OM_1} = \mathbf r_1 = x_1\mathbf i + y_1\mathbf j + z_1\mathbf k$ 
and $\overrightarrow{OM_2} = \mathbf r_2 = x_2\mathbf i + y_2\mathbf j + z_2\mathbf k$ 
which implies (see Sec. 3): 

$$\overrightarrow{M_1M_2} = \mathbf r_2 - \mathbf r_1 = (x_2 - x_1)\,\mathbf i + (y_2 - y_1)\,\mathbf j + (z_2 - z_1)\,\mathbf k$$

5. The distance between two points $M_1(x_1, y_1, z_1)$ and 
$M_2(x_2, y_2, z_2)$ is equal to 

$$M_1M_2 = \sqrt{(\overrightarrow{M_1M_2})^2} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \qquad (14)$$

which follows from the preceding argument. This formula is very 
much like the corresponding formula for a plane [see formula (II.1)]. 

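6. Dividing a segment in a given ratio. Suppose we have 

$$M_1(x_1, y_1, z_1), \quad M_2(x_2, y_2, z_2) \quad \text{and} \quad \lambda$$

(see Fig. 159). It is required to find the point M (x, y, z). We have 

$$\mathbf r_1 = x_1\mathbf i + y_1\mathbf j + z_1\mathbf k, \qquad \mathbf r_2 = x_2\mathbf i + y_2\mathbf j + z_2\mathbf k \qquad (15)$$

The vectors $\overrightarrow{M_1M}$ and $\overrightarrow{MM_2}$ being 
parallel, we see that 
$\overrightarrow{M_1M} = \lambda\,\overrightarrow{MM_2}$. But 
$\overrightarrow{M_1M} = \mathbf r - \mathbf r_1$ and 
$\overrightarrow{MM_2} = \mathbf r_2 - \mathbf r$, that is 
$\mathbf r - \mathbf r_1 = \lambda\,(\mathbf r_2 - \mathbf r)$, 
$\mathbf r - \mathbf r_1 = \lambda\mathbf r_2 - \lambda\mathbf r$ and 
$\mathbf r + \lambda\mathbf r = \mathbf r_1 + \lambda\mathbf r_2$. Hence, 

$$\mathbf r = \frac{\mathbf r_1 + \lambda\mathbf r_2}{1 + \lambda} \qquad (16)$$

Equating the projections of both sides on the axes x, y and z 
[see formulas (7) and (15)] we finally deduce 

$$x = \frac{x_1 + \lambda x_2}{1 + \lambda}, \qquad y = \frac{y_1 + \lambda y_2}{1 + \lambda}, \qquad z = \frac{z_1 + \lambda z_2}{1 + \lambda} \qquad (17)$$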


When passing from formula (16), which represents the solution 
of the problem in a vectorial form, to formula (17), we have projected 
the vectorial formula on the coordinate axes. Generally, it is clear 
that every vectorial equality of the form $\mathbf a = \mathbf b$ 
considered for vectors in space is equivalent to the three scalar 
equalities 

$$a_x = b_x, \qquad a_y = b_y, \qquad a_z = b_z$$

which are obtained as a result of projecting the former equality 
on the coordinate axes. 
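
The coordinates of the dividing point are easily computed from the three scalar equalities. In the sketch below the function name `divide_segment` and the numerical points are hypothetical; rational arithmetic is used so that the results are exact.

```python
from fractions import Fraction


def divide_segment(m1, m2, lam):
    """Point M with M1M = lam * MM2, computed coordinate by coordinate."""
    lam = Fraction(lam)
    if lam == -1:
        raise ValueError("no point divides the segment in the ratio -1")
    return tuple((p + lam * q) / (1 + lam) for p, q in zip(m1, m2))


m1, m2 = (1, 2, 3), (4, 8, -3)
m = divide_segment(m1, m2, 2)          # M1M is twice as long as MM2
midpoint = divide_segment(m1, m2, 1)   # lam = 1 gives the midpoint
print(m, midpoint)
```

For the first point we have M1M = (2, 4, -4) and MM2 = (1, 2, -2), so the required ratio 2 is indeed obtained.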

7. Translation of coordinate axes. Let the coordinate axes x', y' 
and z' be obtained by means of a translation of the axes x, y and z, 
the displacement vector being a (see Fig. 160). Then we have the 
relation $\mathbf r = \mathbf a + \mathbf r'$ for the radius-vectors of 
any point M. Projecting this relation on the coordinate axes we obtain 
the formulas 

$$x = x' + a_x, \qquad y = y' + a_y, \qquad z = z' + a_z$$

which describe the connection between the new coordinates and 
the old ones (see problem 3.1 in Sec. II.2). 

§ 4. Vector Product of Vectors 

11. Orientation of Surface and Vector of an Area. A surface in 
space is called oriented if there is an indication as to which of its 
sides is regarded as outer and which as inner. As a rule, such an 
orientation can be performed in two ways (see Fig. 161). Even when 
we have a closed surface (e.g. a sphere) it is sometimes convenient 
to introduce an “unnatural” orientation, that is to regard the outer 
(in an ordinary, “everyday” sense) side of the surface as inner and 
vice versa. 

Fig. 161 (labels: inner side, outer side) 

An orientation of a non-closed surface can also be determined 
by pointing out the direction of describing its contour. Thus, we 
have two methods of indicating an orientation of a non-closed 
surface. To establish the connection between them it is necessary to 
point out, in addition, whether we take the right-hand screw rule 
or the left-hand screw rule which are illustrated in Fig. 162. For 
instance, the right-hand screw rule can be stated in the following 
way: if we take a right-handed screw (which is usually used in 
engineering and everyday life) and rotate it in the direction of 
describing the contour it must move from the inner side of the surface 
to the outer side. Or, in other words, if we imagine that a man walks on 
the outer side of the surface along its contour in the positive 
direction of describing the contour he must see “the precipice” on the 
right, and the surface itself should remain on the left. 

Fig. 162. Right-handed screw and left-handed screw (labels: outer side, direction of describing the contour) 

When we consider an oriented part of a plane it sometimes turns 
out that only its area and its orientation in space are important 
whereas the specific form of the part (that is whether it is a circle 
or a rectangle etc.) does not matter. In such circumstances we can 
represent this part (S) of the plane by means of a vector which is 
perpendicular to (S) and is directed from its inner side to the 
outer side (see Fig. 163) whereas its absolute value is taken 
equal to the area of (S). We shall call such a vector the vector 
of the area (S) and denote it as $\mathbf S$. Obviously, this vector 
completely defines the area and the orientation in space of such a 
surface element. 

Let us discuss an example of using the vector of an area. Suppose 
we have a homogeneous gas flow, that is a flow in which the velo- 
city v of the particles is the same at all points. Now imagine that 
we put an oriented plane surface element ( S ) into the flow, and 
it is required to determine the volume of the gas which passes 
through ( S ) during the unit interval of time from its inner 
side to the outer side. Since the volume of the gas passing 
in unit time fills a cylinder with base (S) and altitude 
$|\mathbf v|\cos(\widehat{\mathbf S, \mathbf v})$ (think why it is so), 
the sought-for volume is equal to 
$|\mathbf S|\,|\mathbf v|\cos(\widehat{\mathbf S, \mathbf v}) = \mathbf S\cdot\mathbf v$. 
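
A short numerical illustration of this formula is given below. It assumes that the vector of the area and the velocity are given by Cartesian components, so that the scalar product can be computed by formula (12); the numerical values and units are hypothetical.

```python
import math


def dot(u, v):
    """Scalar product by formula (12)."""
    return sum(p * q for p, q in zip(u, v))


def length(v):
    """Absolute value of a vector."""
    return math.sqrt(dot(v, v))


area_vector = (0.0, 0.0, 2.0)   # an element of 2 m^2 whose outer side faces the z-axis
velocity = (3.0, 0.0, 4.0)      # m/s, the same at all points of the flow

volume_per_second = dot(area_vector, velocity)

# The same volume found as base times altitude |v| cos(S, v).
cos_angle = volume_per_second / (length(area_vector) * length(velocity))
altitude = length(velocity) * cos_angle
print(volume_per_second, altitude)
```

Reversing the orientation of the element changes the sign of the result, which then counts the gas passing from the outer side to the inner one.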

12. Vector Product. The vector product of two given vectors a 
and b is defined as the vector of the area of the parallelogram con- 
structed on the vectors a and b (when the vectors are translated to 
a common origin) and oriented so that we should begin with the 
first vector (i.e. with a) while describing the contour of the paralle- 
logram in the positive direction. The definition is illustrated in 
Fig. 164. We shall use the right-hand screw rule throughout our 
course unless the contrary is stated. We have also used the 
right-hand screw rule in Fig. 164. 

Now let us introduce the notion of a directed triad of vectors 
which is important for our further purposes. Let three vectors a, b 
and c with a common origin be given. We shall regard the vectors 
as taken in a certain order, that is a is the first vector, b the second 
and c the third one. Let us also suppose that the vectors do not 
lie in the same plane. Such a triad is said to be right-handed if the 
shortest rotation from the vector a to the vector b is seen to be in 
the counterclockwise direction when we contemplate the rotation 
from the terminus of the vector c. If the rotation is seen to be in 
the clockwise direction the triad is called left-handed. Right- and 
left-handed triads are shown in Fig. 165. The origin of this 
terminology is illustrated in Fig. 166. Note that if we interchange the 
numbers of two vectors retaining the third one at its original place 
then the orientation of the triad will change. For instance, if a 
triad a, b, c is right-handed then the triad a, c, b is left-handed 
(check it!). The orientation of a triad does not change when we 
perform the so-called circular permutation, that is when the second 
vector is substituted for the first vector, the third vector is 
substituted for the second vector and the first vector is 
substituted for the third one (or the other way round). For example, 
if a triad a, b, c is right-handed then the triad b, c, a is also 
right-handed. 
The orientation of a Cartesian triad i, j, k must always correspond 
to the screw rule that we choose. Thus, i, j, k must form a right- 
handed triad in case the right-hand screw rule is chosen and they 
must form a left-handed triad if otherwise. In accordance with the 
rule we distinguish between the so-called right-handed and left- 
handed Cartesian coordinate systems. For instance, the Cartesian 
system in Fig. 157 is a right-handed one. Now we can formulate one 
more definition of a vector product which, as it can be seen in 
Fig. 164, is equivalent to the former definition. 

The vector product of two vectors a and b is a vector c which is 
directed perpendicularly to either vector, has an absolute value 
equal to the area of the parallelogram constructed on the vectors a 
and b and forms a triad with the vectors a and b (the triad a, b, c) 
directed as the triad i, j, k (that is the triads a, b, c and i, j, k have 
the same orientation). Thus, the triad a, b, c is right-handed or 
left-handed in accordance with the Cartesian triad i, j, k. The vector
product of the vectors a and b is denoted as a × b or [a, b].

13. Properties of Vector Product. 

1. The absolute value of the vector product a × b is equal to

$$|\mathbf{a}\times\mathbf{b}| = ab\sin(\widehat{\mathbf{a},\mathbf{b}})$$

Indeed, this expression corresponds to the formula for calculating
the area of a parallelogram.

The last formula together with formula (3) and property 2 in 
Sec. 8 implies the following consequence: 

$$(\mathbf{a}\times\mathbf{b})^2 + (\mathbf{a}\cdot\mathbf{b})^2 = |\mathbf{a}\times\mathbf{b}|^2 + (\mathbf{a}\cdot\mathbf{b})^2 = a^2b^2\sin^2(\widehat{\mathbf{a},\mathbf{b}}) + a^2b^2\cos^2(\widehat{\mathbf{a},\mathbf{b}}) = a^2b^2$$

It should be understood that the first summand on the left-hand
side is the scalar square of a vector (i.e. of a × b) whereas the second
one is the square of a scalar (of the scalar a · b).
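
The last identity is easy to check numerically. The following Python sketch is only an illustration: the helper names `dot` and `cross` and the sample vectors are our own assumptions, and `cross` relies on the coordinate expression (20) of the vector product derived in property 6 below for a right-handed Cartesian system.

```python
def dot(a, b):
    """Scalar product of two vectors given by their Cartesian projections."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product by formula (20); a right-handed system is assumed."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Sample vectors (arbitrary illustrative values).
a = (1.0, 2.0, -1.0)
b = (3.0, 0.5, 2.0)

axb = cross(a, b)
lhs = dot(axb, axb) + dot(a, b) ** 2   # (a x b)^2 + (a . b)^2
rhs = dot(a, a) * dot(b, b)            # a^2 b^2
print(lhs, rhs)
```

Both printed numbers should coincide up to rounding, whatever vectors are substituted for a and b.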

2. A vector product is equal to zero if and only if the vectors are 
parallel: 

a × b = 0 is equivalent to a || b

Indeed, if the vectors are parallel then the corresponding
parallelogram degenerates into a line segment having zero area and
vice versa. In particular, we always have a × a = 0.

3. The vector product is anticommutative: 

b × a = -(a × b)

Indeed, changing the order of the factors does not affect the form
of the parallelogram but its contour will be described in the opposite
direction and the vector of the surface element will therefore be
replaced by the opposite one.

4. A scalar factor can be taken outside the vector product:

(λa) × b = a × (λb) = λ (a × b)

since increasing the length of one of the sides of a parallelogram
λ times results in increasing its area λ times. (If λ < 0 then the
directions of the corresponding vectors are replaced by the opposite
directions but the above rule remains true.)

5. Distributive law: 

(a + b) × c = a × c + b × c,
c × (a + b) = c × a + c × b        (18)

To prove the law let us translate the vectors a, b and c to a common
origin O and draw a plane (P) ⊥ c through O (see Fig. 167).



Fig. 167. OK_1S_1R_1 is the projection of the parallelogram OKSR;
OK'S'R' is obtained by turning and stretching the parallelogram
OK_1S_1R_1.

Now we construct a parallelogram on the vectors a and b, its dia- 
gonal being d. Let us project the parallelogram on the plane ( P ). 
Next we turn the projected parallelogram about the axis c through 
90° and simultaneously extend it c-fold (the whole construction 
is shown in Fig. 167). It is clear that 

d = a + b, d' = a' + b' (19) 

where a' and b' are the sides of the third parallelogram and d' is
its diagonal.

For the sake of convenience the operations on the vector a performed
in the above construction are depicted separately in Fig. 168.
We see that a' ⊥ OK and a' ⊥ c [because c ⊥ (P)] and the vector
a' is therefore perpendicular to the plane KOM. Besides, the length
of a' is equal to the length of the projection of a on (P) multiplied
by c, this product being equal to the area of the parallelogram
OKLM constructed on the vectors a and c (why is it so?).
Consequently, by the definition of the vector product, a' = a × c.
Similarly, b' = b × c and d' = d × c. Now we deduce from formulas
(19) the first formula (18): (a + b) × c = d × c = d' =
= a' + b' = a × c + b × c. Changing the order of the factors
and simultaneously changing the signs we obtain the second
formula (18).

These properties enable us to remove brackets in expressions
containing vector products but in doing this we should pay attention
to the order of factors. Here we give an example:

(a + 2b) × (2a - 3b) = 2a × a - 3a × b + 4b × a - 6b × b = -7a × b



It should be noted that the associative law does not hold for the
vector product: (a × b) × c may be unequal to a × (b × c). That is why
expressions of the form a × b × c must not be used without brackets.
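
The removal of brackets and the failure of the associative law can both be observed on concrete vectors. The sketch below is an illustration only: the helper names `add`, `scale` and `cross` and the integer sample vectors are our assumptions, and the coordinate formula (20) for a right-handed Cartesian system is used to compute the products.

```python
def add(a, b):
    """Sum of two vectors."""
    return tuple(x + y for x, y in zip(a, b))


def scale(k, a):
    """Product of the number k and the vector a."""
    return tuple(k * x for x in a)


def cross(a, b):
    """Vector product by formula (20); a right-handed system is assumed."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Sample vectors (arbitrary illustrative values).
a, b, c = (1, 2, 3), (-2, 0, 1), (4, -1, 2)

# (a + 2b) x (2a - 3b) compared with -7 (a x b).
expanded = cross(add(a, scale(2, b)), add(scale(2, a), scale(-3, b)))
shortcut = scale(-7, cross(a, b))
print(expanded, shortcut)

# The two ways of placing brackets give different vectors.
left = cross(cross(a, b), c)
right = cross(a, cross(b, c))
print(left, right)
```

For these particular vectors the first two results agree whereas the last two do not, which is why the brackets cannot be omitted.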

Fig. 168

6. Expressing vector product in terms of Cartesian projections.
Let vectors a and b be given in the form of their representations
(9). To express their vector product in terms of their Cartesian
projections we shall utilize the following equalities:

i × j = k, j × i = -k, j × k = i, k × j = -i,
k × i = j, i × k = -j


An essential fact is that these equalities do not depend on the par- 
ticular choice of a Cartesian coordinate system and on the choice 
of the right-hand or left-hand screw rule. The verification of these 
equalities is left to the reader. Now we have 

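$$\begin{aligned}
\mathbf{a}\times\mathbf{b} &= (a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}) \times (b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k}) \\
&= a_xb_y\mathbf{k} - a_xb_z\mathbf{j} - a_yb_x\mathbf{k} + a_yb_z\mathbf{i} + a_zb_x\mathbf{j} - a_zb_y\mathbf{i} \\
&= \mathbf{i}(a_yb_z - a_zb_y) + \mathbf{j}(a_zb_x - a_xb_z) + \mathbf{k}(a_xb_y - a_yb_x)
\end{aligned} \tag{20}$$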

The result thus obtained can be rewritten in the form of a deter- 
minant [see formula (VI.8)]: 


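$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}$$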


This is a remarkable formula! In this form the formula for the vector 
product is easy to memorize. 
Suppose it is necessary to evaluate the area S of a parallelogram
constructed on the vectors a = 3i - 2j + k and b = -2i + j + 4k.
As we know, S = | a × b |. Calculating we find

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 3 & -2 & 1 \\ -2 & 1 & 4 \end{vmatrix} = \mathbf{i}(-8 - 1) - \mathbf{j}(12 + 2) + \mathbf{k}(3 - 4) = -9\mathbf{i} - 14\mathbf{j} - \mathbf{k}$$

and thus

$$S = |\mathbf{a}\times\mathbf{b}| = \sqrt{9^2 + 14^2 + 1^2} = 16.7$$

The result is dimensionless since the vectors a and b were dimen- 
sionless. If we wanted to receive the “genuine area” we should put, 
for example, a = 3i cm and the like. 
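
The same computation can be repeated mechanically. The sketch below is an illustration under our own naming (`cross` is a hypothetical helper implementing the coordinate formula for the vector product in a right-handed Cartesian system); it takes a = 3i - 2j + k and b = -2i + j + 4k as in the example.

```python
import math


def cross(a, b):
    """Vector product in terms of Cartesian projections (right-handed system)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


a = (3, -2, 1)    # a = 3i - 2j + k
b = (-2, 1, 4)    # b = -2i + j + 4k

product = cross(a, b)
area = math.sqrt(sum(x * x for x in product))
print(product, round(area, 1))
```

The components of `product` are the coefficients of i, j and k in a × b and `area` is the absolute value of this vector.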

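Now we point out one more useful formula. Let us take two
vectors a = a_x i + a_y j and b = b_x i + b_y j in the x, y-plane. Suppose
it is necessary to evaluate the area S of the parallelogram
constructed on a and b. Since

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & 0 \\ b_x & b_y & 0 \end{vmatrix} = \begin{vmatrix} a_x & a_y \\ b_x & b_y \end{vmatrix} \mathbf{k}$$

the geometrical meaning of the vector product implies (verify it!)

$$\begin{vmatrix} a_x & a_y \\ b_x & b_y \end{vmatrix} = \pm S, \quad \text{that is} \quad S = \left|\, \begin{vmatrix} a_x & a_y \\ b_x & b_y \end{vmatrix} \,\right| \tag{21}$$

We take + or - depending on whether the direction of the shortest
rotation from a to b coincides with the direction of the rotation
from i to j or not.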
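
The role of the sign in formula (21) is seen in a short computation. In the sketch below the function name `signed_area` and the sample vectors are our own assumptions; the function simply evaluates the second-order determinant formed from the projections of a and b.

```python
def signed_area(a, b):
    """Second-order determinant | a_x a_y ; b_x b_y | entering formula (21)."""
    return a[0] * b[1] - a[1] * b[0]


# Sample plane vectors (arbitrary illustrative values).
a = (3, 1)
b = (1, 2)

plus = signed_area(a, b)    # shortest rotation from a to b goes as from i to j
minus = signed_area(b, a)   # the opposite order reverses the rotation
area = abs(plus)            # the area S itself
print(plus, minus, area)
```

Interchanging the vectors changes only the sign of the determinant, the area S being the same in both cases.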

The notion of the moment of a vector a applied at a point M about 
a fixed point O is connected with the notion of a vector product. 

The moment mom_O a is defined by the formula mom_O a = OM × a =
= r × a. In physics we consider the moments of a force, of a
velocity, of a momentum and so on. If we change the position of the
origin of the vector a then, in general, its moment will also change.
Let us now translate the vector a along its “line of action” to a point
M'. Denote a in the new position as a'. Since MM' = λa we have

mom_O a' = r' × a = (r + λa) × a = r × a = mom_O a

Thus, such a translation does not affect the moment and we may 
therefore regard the vector a entering in the definition of a moment 
as a sliding vector (see Sec. 1). 
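
The invariance of the moment under a translation along the line of action can be checked on numbers. The sketch below is an illustration only: the helper names `cross` and `moment`, the choice of the point O as the origin of coordinates and the sample values of r, a and λ are our assumptions.

```python
def cross(a, b):
    """Vector product in terms of Cartesian projections (right-handed system)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def moment(r, a):
    """Moment about O of the vector a applied at the point with radius-vector r."""
    return cross(r, a)


r = (1, 2, 0)    # radius-vector OM of the point of application
a = (0, 3, 1)    # the vector whose moment is taken
lam = 2          # M' is obtained from M by the translation lam * a

r_shifted = tuple(x + lam * y for x, y in zip(r, a))
print(moment(r, a), moment(r_shifted, a))
```

Both moments coincide because the additional term λa × a is equal to zero.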

14. Pseudovectors. There are vectors which depend on whether
we choose the right-hand or the left-hand screw rule so that when
one of the rules is replaced by the other the directions of these
vectors are replaced by the opposite ones. Such vectors are called
pseudovectors (or axial vectors) in contrast to true vectors whose
direction is independent of the choice of a screw rule. For instance,
the velocity vector of a translatory motion of a solid does not depend
on the choice of a screw rule (this is implied by its physical meaning)
and is therefore a true vector. On the contrary, the angular velocity
vector ω of a rotary motion of a solid is a pseudovector. Indeed,
such a vector lies in the axis of revolution and its absolute value
is equal to the numerical value of the angular speed but its direction
depends on the choice of a screw rule (see Fig. 169).

Fig. 169

It is clear that by the definition of the vector product of two true 
vectors a vector product is a pseudovector because if we change the 
screw rule the former outer side of the parallelogram constructed 
on the vectors a and b becomes the inner side and vice versa. The 
same reasoning shows that the moment of a force (see the end of 
Sec. 13) is a pseudovector. Further, the vector product of a true 
vector by a pseudovector is a true vector whereas the vector product 
of two pseudovectors is also a pseudovector. It is also easy to verify 
that the linear velocity v of any point M of a rotating solid which
is a true vector is connected with the pseudovector ω by the
formula v = ω × r where r = OM and O is an arbitrarily chosen
point lying on the axis of revolution.
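
That the point O may be taken anywhere on the axis is readily seen in an example. In the sketch below we assume, for illustration only, that the axis of revolution is the z-axis, that the right-hand screw rule is used and that the angular speed is 2; the names `cross` and `velocity` and all the sample coordinates are hypothetical.

```python
def cross(a, b):
    """Vector product in terms of Cartesian projections (right-handed system)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def velocity(omega, origin, point):
    """Linear velocity v = omega x r with r drawn from `origin` to `point`."""
    r = tuple(p - o for p, o in zip(point, origin))
    return cross(omega, r)


omega = (0, 0, 2)    # angular velocity directed along the z-axis
M = (1, 0, 5)        # a point of the rotating solid

v_first = velocity(omega, (0, 0, 0), M)    # O taken at the origin
v_second = velocity(omega, (0, 0, 3), M)   # another point O on the same axis
print(v_first, v_second)
```

The two velocities coincide since the difference of the two radius-vectors is parallel to ω and contributes nothing to the vector product.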

We sometimes also distinguish between “true scalars” and
pseudoscalars. A pseudoscalar is a scalar which is multiplied by -1 when
the original screw rule is changed. For example, it is easy to verify 
that the scalar product of a true vector by a pseudovector is a pseudo- 
scalar. 
§ 5. Products of Three Vectors

15. Triple Scalar Product. Let three vectors a, b and c be given.
The scalar quantity (a × b) · c is called the triple (mixed) product
of the three vectors a, b and c. The geometrical significance of the
triple product is seen in Fig. 170 (here d = a × b):

(a × b) · c = d · c = d proj_d c = | a × b | proj_d c = Sh = V

that is we have obtained the volume of a parallelepiped constructed
on the vectors a, b and c. The vectors a, b and c in Fig. 170 form
a right-handed triad, and the volume is obtained with the sign +.
If the triad were left-handed the angle between c and d would be
obtuse. In this case (a × b) · c = -V. (We suppose here that our
considerations are based on the right-hand screw rule as it was
pointed out in Sec. 12.)

Let us indicate the following properties of a triple product. 

1. A circular permutation of factors does not change their triple 
product since neither the parallelepiped (see Fig. 170) nor the orien- 
tation of the triad of the vectorial factors changes under such a trans- 
formation (see Sec. 12): 

(a × b) · c = (b × c) · a = (c × a) · b

But if we interchange only two factors the sign of the triple product
changes. For example, (c × b) · a = -(a × b) · c.

2. A triple product is equal to zero if and only if the three vec- 
tors are parallel to a plane. Indeed, such a parallelism means that 
the corresponding parallelepiped degenerates into a plane geomet- 
rical figure, that is it has a zero volume. 





3. By formulas (20) and (12) we deduce the following expression
of a triple product in terms of Cartesian projections:

$$\begin{aligned}
(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} &= [(a_yb_z - a_zb_y)\mathbf{i} - (a_xb_z - a_zb_x)\mathbf{j} + (a_xb_y - a_yb_x)\mathbf{k}] \cdot (c_x\mathbf{i} + c_y\mathbf{j} + c_z\mathbf{k}) \\
&= (a_yb_z - a_zb_y)c_x - (a_xb_z - a_zb_x)c_y + (a_xb_y - a_yb_x)c_z
\end{aligned}$$

or, finally,

$$(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}$$

(to verify the formula expand the determinant in minors of the last
row according to Sec. VI.3).

Let three vectors a, b and c be given in the form of the expression
in terms of their Cartesian projections. Then, by the above formula
and property 2, we obtain the necessary and sufficient condition
for these three vectors to be parallel to the same plane:

$$\begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix} = 0 \tag{22}$$

The triple product of three true vectors (see Sec. 14) is a scalar
product of a pseudovector by a true vector, i.e. a pseudoscalar.
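
The properties of the triple product listed in this section can be illustrated numerically. The sketch below rests on our own assumptions: the helper names `dot`, `cross`, `triple` and `det3` are hypothetical, the sample vectors are arbitrary and a right-handed Cartesian system is supposed.

```python
def dot(a, b):
    """Scalar product in terms of Cartesian projections."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product in terms of Cartesian projections (right-handed system)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def triple(a, b, c):
    """Triple scalar product (a x b) . c."""
    return dot(cross(a, b), c)


def det3(a, b, c):
    """Third-order determinant with the rows a, b, c (expansion by the first row)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


a, b, c = (1, 0, 0), (1, 2, 0), (1, 1, 3)

volume = triple(a, b, c)                      # volume of the parallelepiped
cyclic = (triple(b, c, a), triple(c, a, b))   # circular permutations
swapped = triple(c, b, a)                     # two factors interchanged
d = (3, 4, 0)                                 # d = a + 2b lies in the plane of a, b
print(volume, cyclic, swapped, det3(a, b, c), triple(a, b, d))
```

The last printed number is zero in accordance with condition (22) since a, b and d are parallel to the same plane.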

16. Triple Vector Product. The vector (a × b) × c is called the
triple vector product of three vectors a, b and c. It has no important
geometrical meaning but is expressed by a formula which is of use
for some applications. To deduce this formula let us choose the
Cartesian coordinate axes in such a way that the x-axis is directed
along the vector a and the y-axis lies in the plane of vectors a and b
(see Fig. 171). Then the projections of the vector a on the y-axis and
on the z-axis will be equal to zero, that is a = a_x i. Similarly,
b = b_x i + b_y j and c = c_x i + c_y j + c_z k. From this we obtain

Fig. 171

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & 0 & 0 \\ b_x & b_y & 0 \end{vmatrix} = a_xb_y\mathbf{k},$$

$$\begin{aligned}
(\mathbf{a}\times\mathbf{b})\times\mathbf{c} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 0 & a_xb_y \\ c_x & c_y & c_z \end{vmatrix} = -\mathbf{i}\,a_xb_yc_y + \mathbf{j}\,a_xb_yc_x \\
&= -(b_xc_x + b_yc_y)\,a_x\mathbf{i} + a_xc_x\,(b_x\mathbf{i} + b_y\mathbf{j})
\end{aligned}$$

(check up these formulas!). Finally, using formula (12) we get

(a × b) × c = (a · c) b - (b · c) a

This final formula no longer contains any coordinate projections
and therefore does not depend on the particular choice of the
coordinate system.

The following formula is also sometimes of use: 

a × (b × c) = -(b × c) × a = -[(b · a) c - (c · a) b] =
= (a · c) b - (a · b) c
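
Both formulas for the triple vector product can be verified on concrete vectors. The sketch below is only an illustration; the helper names `dot`, `cross`, `scale` and `sub` and the integer sample vectors are our assumptions, and a right-handed Cartesian system is supposed.

```python
def dot(a, b):
    """Scalar product in terms of Cartesian projections."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product in terms of Cartesian projections (right-handed system)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(k, a):
    """Product of the number k and the vector a."""
    return tuple(k * x for x in a)


def sub(a, b):
    """Difference of two vectors."""
    return tuple(x - y for x, y in zip(a, b))


a, b, c = (1, 2, 3), (-2, 0, 1), (4, -1, 2)

# (a x b) x c compared with (a . c) b - (b . c) a
first = cross(cross(a, b), c)
first_formula = sub(scale(dot(a, c), b), scale(dot(b, c), a))

# a x (b x c) compared with (a . c) b - (a . b) c
second = cross(a, cross(b, c))
second_formula = sub(scale(dot(a, c), b), scale(dot(a, b), c))
print(first, first_formula, second, second_formula)
```

Each direct computation coincides with the corresponding right-hand side, the two triple products themselves being different.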

In conclusion we note that vector algebra is a comparatively new 
branch of mathematics. It was created and developed in the second 
half of the 19th century in connection with problems of algebra, 
geometry, mechanics and physics. 

§ 6. Linear Spaces 

17. Concept of Linear Space. One of the characteristic features 
of vectors is the possibility to perform linear operations on them, 
that is the operation of addition and the operation of
multiplication by numbers (see § 1). These operations can also be performed
on some other objects, such as polynomials or arbitrary functions. 
Since these operations have similar properties in all cases, there is 
every reason to consider the general notion of a linear space which 
is understood as a set of some objects such that the linear operations 
can be performed on them within the set. Such a general, abstract, 
consideration yields a general view on linear operations which 
enables us to find some important properties in concrete problems. 

Let (R) be a set (totality) of some objects. We shall call these
objects elements (members) of (R). If a is one of the objects we also
say that a belongs to (R). This fact is designated as a ∈ (R). The
fact that an element b does not belong to (R) is written as b ∉ (R).
For instance, if (I) is the set of all integers then 3 ∈ (I), -5 ∈ (I)
but π ∉ (I).

Now we turn to the strict definition of the concept of a linear space. 

A set (R) is called a linear space if for any x ∈ (R) and y ∈ (R) the
notion of the sum x + y ∈ (R) is defined in a certain way, and if
for any real number λ the product λx = xλ ∈ (R) is defined. For
example, we can regard the set of all vectors for which the addition
and the multiplication by numbers are performed in accordance with
the rules of § 1 as a space (R). The set of all complex numbers for
which the rules of addition and multiplication by real numbers are
known from elementary mathematical courses can also be regarded
as a linear space. Some other examples will be given in Sec. 18.
Besides, the operations of addition and multiplication must have
some natural properties, namely those properties which were proved
in § 1 for vectors and which should be introduced as the axioms
of a linear space in the general case. The properties being essentially
the same as those of vectors, the elements of a linear space are
also often called vectors and are denoted like vectors.

Here we shall give the axioms of a linear space without discussing 
the question of their independence. The thing is that some of these 
axioms are the consequences of the rest but this is of no importance 
for the aims of our course. Thus, the sum must satisfy the following 
conditions: 

1. Associativity: (x + y) + z = x + (y + z) for any x, y, z ∈ (R).

2. Commutativity: x + y = y + x for any x, y ∈ (R).

3. The existence of a zero element in (R): there must be an element
[denoted as 0; 0 ∈ (R)] which satisfies the condition x + 0 = x
for any x ∈ (R).

4. The existence of the negative of x ∈ (R): for any x ∈ (R) there
must be an element [denoted as -x; -x ∈ (R)] which satisfies the
condition (-x) + x = 0.

It is easily verified that the zero element of a space must be unique 
and that there is only one negative for any element. We shall not 
discuss the general proof of these facts (in all the concrete examples 
which we shall consider here these properties are obvious). 

The multiplication of an element by a number must satisfy the 
following requirements: 

5. λ (μx) = (λμ) x.

6. 1x = x.

7. (-1) x = -x.

8. 0x = 0.

9. λ0 = 0.

The operation of division by a number is introduced by the formula

$$\frac{\mathbf{x}}{\lambda} = \frac{1}{\lambda}\,\mathbf{x} \quad (\lambda \ne 0).$$

Finally, both linear operations are connected by the distributive
laws:

10. (λ + μ) x = λx + μx and 11. λ (x + y) = λx + λy.

All these properties make it possible to perform linear operations 
in linear spaces and transformations of linear combinations of ele- 
ments (see Sec. 5) according to the usual arithmetical rules. Linear 
spaces and their properties are treated in detail in courses on linear 
algebra (see, for example, [16]). 

It is sometimes necessary to consider sets of elements in which 
only one operation of addition satisfying axioms 1-4 is defined. Such 
a set is called an Abelian group after N. H. Abel (1802-1829), a Nor- 
wegian scientist and one of the most prominent mathematicians 
of the 19th century. Besides, it is sometimes possible to perform the 
multiplication not only by real numbers but also by any complex 
numbers. A linear space of this kind is called a linear space over
the field of complex numbers or, briefly, a complex linear space. The 
term “a number field” is applied to any set of numbers in which 
it is possible to perform the four fundamental operations of arithme- 
tic with the natural exception of the division by zero. For example, 
all the real numbers or all the complex numbers form a field whereas 
the set of all integers does not form a number field (why is it so?). 

18. Examples. 

1. One of the simplest examples of a linear space is the set of all 
ordinary vectors with the linear operations described in § 1. The 
set of all vectors parallel to a certain plane is a linear space which 
is a linear subspace of the previous space. The set of all vectors 
parallel to a certain straight line is also a subspace of the space of 
all vectors. The zero vector itself is a linear space, from the formal 
point of view, since all the axioms 1-11 are fulfilled here. 

In the general case a set (R_1) contained in a linear space (R) is
called a linear subspace of (R) if (R_1) itself is a linear space with
the operations which are originally defined for elements of (R). In
other words, there must be x + y ∈ (R_1) and λx ∈ (R_1) for any
x ∈ (R_1), y ∈ (R_1) and an arbitrary real λ; if these two conditions
hold the verification of axioms 1-11 is no longer needed since they
are fulfilled in the whole space (R).

2. The set of all polynomials P (x) of degree not higher than n
where n ≥ 0 is a given integer is a linear space. If n assumes
successive values 0, 1, 2 etc. we get a sequence of linear spaces and
each subsequent space contains all the preceding ones as linear
subspaces. A still more general linear space is formed by the totality
of all functions f (x) defined over a fixed interval. Thus, each of
these functions f (x) can be regarded as a vector of the linear space
of functions or, as we say, of a functional space. This approach is
characteristic of modern mathematics.

3. The so-called n-dimensional real Cartesian space E_n where
n = 1, 2, 3, ... represents a very important example of a linear
space and we shall consider this notion here.

We regard each ordered n-tuple (a_1, a_2, ..., a_n) of real
numbers a_1, a_2, ..., a_{n-1}, a_n as an element of E_n. Take for
example E_4. Then each element of the form (a_1, a_2, a_3, a_4) where
a_1, a_2, a_3 and a_4 are arbitrary real numbers is a point or a vector
of E_4 (ordinary geometrical vectors may be regarded as points of E_3
since we can regard them as radius-vectors of points; the vectors, i.e.
the elements, of a general linear space are therefore also called
points). The numbers a_1, a_2, a_3 and a_4 are called the coordinates of
the point (of the vector) (a_1, a_2, a_3, a_4). For example, (-3, 0, 1, 3)
is one of such points. The point (0, 0, 0, 0) is called the origin (of
the coordinate system) in E_4. The whole space E_4 is the set, the
totality, of all such points.
equation (23) we arrive at a linear dependence between x_1, x_2 and x_3.
If D = 0 the right-hand sides of the first two relations (23) are
proportional to each other and therefore even x_1 and x_2 are linearly
dependent here, the same being true, of course, for x_1, x_2 and x_3.

Let us now consider as an example the linear space of all
polynomials P (x) of degree ≤ 2 (see Sec. 18), that is of polynomials
of the form ax^2 + bx + c. All the polynomials are linear combinations
of the powers x^2, x^1 = x and x^0 = 1, the powers themselves being
linearly independent. Indeed, none of the powers is a linear
combination of the rest. It follows that the space in question is
three-dimensional. The above lemma implies that any set consisting
of more than three such polynomials is linearly dependent. The
elements 1, x and x^2 form a basis in this space.

Similarly, we conclude that the space of polynomials of degree
≤ n has the dimension n + 1. The space of polynomials of all
degrees is infinite-dimensional.
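
The treatment of polynomials as vectors becomes quite tangible if a polynomial of degree not higher than 2 is stored as the triple of its coefficients. The sketch below is an illustration under our own conventions: a polynomial ax^2 + bx + c is represented by the tuple (c, b, a), and the names `add`, `scale`, `evaluate`, `ONE`, `X` and `X2` are hypothetical.

```python
def add(p, q):
    """Sum of two polynomials given by their coefficient triples."""
    return tuple(x + y for x, y in zip(p, q))


def scale(lam, p):
    """Product of the number lam and the polynomial p."""
    return tuple(lam * x for x in p)


def evaluate(p, x):
    """Value of the polynomial a x^2 + b x + c stored as (c, b, a)."""
    c, b, a = p
    return (a * x + b) * x + c


# The basis 1, x, x^2 of the three-dimensional space of polynomials.
ONE, X, X2 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# 3x^2 - 2x + 5 as a linear combination of the basis elements.
p = add(add(scale(5, ONE), scale(-2, X)), scale(3, X2))
q = (1, 1, 0)    # the polynomial x + 1

print(p, evaluate(p, 2), evaluate(add(p, q), 2))
```

The linear operations on the coefficient triples agree with the usual operations on the values of the polynomials, which is what allows us to regard polynomials as vectors.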

The linear space E_n (see Sec. 18), as we could naturally expect,
is n-dimensional in the sense of our definition. We shall illustrate
this property by taking the space E_4 as an example. Let us introduce
the following vectors e_1, e_2, e_3 and e_4:

e_1 (1, 0, 0, 0), e_2 (0, 1, 0, 0), e_3 (0, 0, 1, 0),
e_4 (0, 0, 0, 1)

Obviously, the vectors are linearly independent. Besides, every
vector x (x_1, x_2, x_3, x_4) belonging to E_4 is represented as a linear
combination of e_1, e_2, e_3 and e_4:

x (x_1, x_2, x_3, x_4) = x_1e_1 + x_2e_2 + x_3e_3 + x_4e_4

Hence, by the lemma, the linear space in question is
four-dimensional. The vectors e_1, e_2, e_3 and e_4 form a basis in E_4.

There is an infinitude of bases in every finite-dimensional linear
space. For instance, let us consider the space of polynomials of
degree ≤ 2. Choose three arbitrary values x = x_1, x = x_2,
x = x_3 and denote the polynomial of the second degree which is
equal to 1 at the point x = x_k (k = 1, 2, 3) and equal to zero at
other points x_l (l = 1, 2, 3, l ≠ k) by P_k (x). We can easily verify
that

$$P_1(x) = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)}$$

The polynomials P_2 (x) and P_3 (x) are expressed similarly. The three
polynomials P_1 (x), P_2 (x) and P_3 (x) are a basis in the space in
question. The representation of an arbitrary polynomial of degree ≤ 2
as a linear combination of P_1 (x), P_2 (x) and P_3 (x) is nothing but
a special case of Lagrange’s interpolation formula (V.23). Of course,
we suppose here that x_k ≠ x_l (l, k = 1, 2, 3, k ≠ l).
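
The basis of polynomials equal to 1 at one of the chosen points and to zero at the two others can be tried out in a short computation. The sketch below is an illustration only: the function names `basis_polynomial`, `sample` and `rebuilt`, the three nodes and the sample polynomial are our assumptions, and the indices run from 0 instead of 1 as is customary in Python.

```python
def basis_polynomial(nodes, k):
    """Return the polynomial equal to 1 at nodes[k] and to 0 at the other nodes."""
    def P(x):
        value = 1.0
        for l, x_l in enumerate(nodes):
            if l != k:
                value *= (x - x_l) / (nodes[k] - x_l)
        return value
    return P


nodes = (0.0, 1.0, 3.0)    # three distinct points (arbitrary values)


def sample(x):
    """An arbitrary polynomial of degree 2 used for the test."""
    return 2.0 * x * x - 3.0 * x + 1.0


def rebuilt(x):
    """Linear combination of the basis with the coefficients sample(x_k)."""
    return sum(sample(x_k) * basis_polynomial(nodes, k)(x)
               for k, x_k in enumerate(nodes))


print(sample(2.0), rebuilt(2.0))
```

The coefficients of the representation are simply the values of the given polynomial at the chosen points, and the two printed numbers should coincide up to rounding.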

We have already mentioned the notion of a linear subspace (R_1)
(see Sec. 18) of a linear space (R). It is possible to prove that if
(R_1) ≠ (R) and if the space (R) is finite-dimensional the dimension
of (R_1) is less than that of (R). The proof which we leave to the
reader is rather simple, it is based on the definition of a dimension.
By the way, it is convenient to use the following definition of
linear dependence which is equivalent to the former definition
when proving the above statement: vectors a, b, ..., d are called
linearly dependent if they satisfy a relation of the form αa +
+ βb + ... + δd = 0 where at least one of the numbers α, β, ...,
δ is different from zero.

The important notion of a hyperplane is introduced with the aid
of the notion of a linear subspace. Let us take a linear subspace
of E_n and draw all the vectors of the subspace from a certain point
of E_n. Then the set of points which are the termini of the vectors
is a hyperplane. One-dimensional hyperplanes in E_n are called
straight lines and two-dimensional hyperplanes in E_n are called
planes. Besides, there are hyperplanes of dimensions 3, 4 etc. up
to n - 1. We also introduce the formal notion of a “hyperplane
of dimension 0” which is simply a separate point and the notion
of a “hyperplane of dimension n” which is the whole space E_n. It
is possible to construct “a stereometry” in E_n.

In conclusion we mention an important notion of an isomorphism
of linear spaces. Two linear spaces (R) and (R') are called isomorphic
(or, more precisely, linearly isomorphic) if it is possible to establish
a one-to-one correspondence between the vectors of the spaces in
such a way that the linear operations on the corresponding vectors
are performed according to similar rules. A more comprehensive
statement of this property is that if the vectors x, y ∈ (R)
correspond, respectively, to the vectors x', y' ∈ (R') then the vector
x + y must correspond to the vector x' + y' and the vector
λx must correspond to the vector λx', i.e.

(x + y)' = x' + y', (λx)' = λx'

where the prime designates the transition from a vector of (R) to
the corresponding vector of (R'). The term “one-to-one
correspondence” means that only one vector from (R) corresponds to
every vector from (R') and vice versa. Isomorphic linear spaces are
indistinguishable from the point of view of linear operations since all
the implications based on such operations in one of the spaces are
true for all the isomorphic ones.

In particular, it follows that isomorphic linear spaces are of the
same dimension. Conversely, all the finite-dimensional linear spaces
of one and the same dimension are isomorphic to each other. Indeed,
if p_1, p_2, ..., p_n is a basis in (R) and p'_1, p'_2, ..., p'_n is a
basis in (R') it is easy to verify that the correspondence of the form

α_1p_1 + α_2p_2 + ... + α_np_n ↔ α_1p'_1 + α_2p'_2 + ... + α_np'_n

yields a desired isomorphism. Hence, with respect to linear
operations, every finite-dimensional linear space is completely
characterized by its dimension. In particular, we conclude that
each finite-dimensional linear space of dimension n is isomorphic
to the vector space E_n.

The theory of finite-dimensional linear spaces and its applications 
are studied in linear algebra whereas infinite-dimensional spaces 
are considered in functional analysis. 

20. Concept of Euclidean Space. A linear space ( R ) is called a 
Euclidean space if the notion of a scalar product of any two vectors 
of (R) is introduced and if the following natural axioms of a scalar 
product are fulfilled: 

12. The scalar product x · y = (x, y) of any two vectors x, y ∈ (R)
is a real number;

13. (x, x) > 0 for any x ≠ 0;

14. (x, y) = (y, x);

15. (x + y, z) = (x, z) + (y, z);

16. (λx, y) = λ (x, y).

All these properties were proved for ordinary geometrical vectors 
in § 3. Thus, the set of all geometrical vectors is a Euclidean space. 
The set of all vectors parallel to a plane is also a Euclidean space. 
The same is true for the set of all vectors parallel to a straight line. 
It is clear that a linear subspace of a Euclidean space is always 
a Euclidean space. 

The set of all vectors of the n-dimensional Cartesian space E_n
represents an important example of a Euclidean space if we
introduce the scalar product of any two vectors x (x_1, x_2, ..., x_n)
and y (y_1, y_2, ..., y_n) by the formula

(x, y) = x_1y_1 + x_2y_2 + ... + x_ny_n        (24)

which is analogous to the well-known formula (12). It is easy to
verify that all axioms 12-16 are fulfilled here.

It is possible to introduce the notion of a length or, as we often
say, of a norm of any vector by means of the formula | x | =
= √(x, x). The norm of x is also denoted as || x ||. Axiom 13 implies
that the norm of any nonzero vector is positive. If the norm of a
vector is equal to unity the vector is called a normalized vector (for
ordinary geometrical vectors we use the term “a unit vector”). From
axioms 14 and 16 it follows at once that

$$|\lambda\mathbf{x}| = \sqrt{(\lambda\mathbf{x}, \lambda\mathbf{x})} = \sqrt{\lambda^2(\mathbf{x},\mathbf{x})} = |\lambda|\sqrt{(\mathbf{x},\mathbf{x})} = |\lambda|\,|\mathbf{x}|$$

This implies, in particular, that | 0 | = 0 and that each vector
x ≠ 0 can be normalized, that is we obtain a normalized vector
by dividing x by | x |.
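
The scalar product (24), the norm and the normalization are easily programmed for vectors of E_n given by their coordinates. The sketch below is an illustration only; the names `inner`, `norm` and `normalize` and the sample vector of E_4 are our assumptions.

```python
import math


def inner(x, y):
    """Scalar product of two vectors of E_n by formula (24)."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm | x | equal to the square root of (x, x)."""
    return math.sqrt(inner(x, x))


def normalize(x):
    """Normalized vector obtained by dividing x by | x | (x must be nonzero)."""
    length = norm(x)
    return tuple(a / length for a in x)


x = (3.0, 0.0, 4.0, 0.0)             # a sample vector of E_4
doubled = tuple(-2.0 * a for a in x)  # the vector -2x

print(norm(x), norm(doubled), normalize(x), norm(normalize(x)))
```

The norm of -2x is twice the norm of x in accordance with the formula | λx | = | λ | | x |, and the normalized vector has the norm equal to unity.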
Let us now deduce an estimate for a scalar product of two
arbitrary vectors x, y ∈ (R). In order to do this we note that we have

$$0 \le |\mathbf{x} + \lambda\mathbf{y}|^2 = (\mathbf{x} + \lambda\mathbf{y},\, \mathbf{x} + \lambda\mathbf{y}) = (\mathbf{y},\mathbf{y})\lambda^2 + 2(\mathbf{x},\mathbf{y})\lambda + (\mathbf{x},\mathbf{x}) \tag{25}$$

for any λ. If we regard the right-hand side of (25) as a quadratic
trinomial in λ the retention of its sign implies that the discriminant
of the trinomial is non-positive, i.e.

$$(\mathbf{x},\mathbf{y})^2 - (\mathbf{x},\mathbf{x})(\mathbf{y},\mathbf{y}) \le 0$$

(why is it so?). From this we obtain

$$|(\mathbf{x},\mathbf{y})| = \sqrt{(\mathbf{x},\mathbf{y})^2} \le \sqrt{(\mathbf{x},\mathbf{x})(\mathbf{y},\mathbf{y})} = |\mathbf{x}|\,|\mathbf{y}| \tag{26}$$

This important inequality was established in 1821 for E n by Cauchy 
(Cauchy in fact used a terminology different from that of the theory 
of linear spaces). For some other cases the inequality was deduced 
in 1859 by V. Ya. Bunyakovsky (1804-1889), a prominent Russian 
mathematician, and by the German mathematician H. Schwarz 
(1843-1921) in 1884. 

By inequality (26), we obtain, in particular, 

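$$|\mathbf{x}+\mathbf{y}|^2 = (\mathbf{x}+\mathbf{y},\, \mathbf{x}+\mathbf{y}) = (\mathbf{x},\mathbf{x}) + 2(\mathbf{x},\mathbf{y}) + (\mathbf{y},\mathbf{y}) \le |\mathbf{x}|^2 + 2|\mathbf{x}|\,|\mathbf{y}| + |\mathbf{y}|^2$$

which implies

$$|\mathbf{x}+\mathbf{y}| \le |\mathbf{x}| + |\mathbf{y}|$$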

The last inequality is called the triangle inequality (think why it 
is called so). 
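
Inequality (26) and the triangle inequality can be observed on vectors of E_4. The sketch below is an illustration only: the names `inner` and `norm` and the sample vectors are our assumptions, and a numerical check on particular vectors is, of course, no substitute for the proof.

```python
import math


def inner(x, y):
    """Scalar product of two vectors of E_n by formula (24)."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm | x | equal to the square root of (x, x)."""
    return math.sqrt(inner(x, x))


x = (1.0, 2.0, -1.0, 3.0)
y = (2.0, 1.0, 0.0, -1.0)
total = tuple(a + b for a, b in zip(x, y))   # the vector x + y
parallel = tuple(2.0 * a for a in x)         # the vector 2x

print(abs(inner(x, y)), norm(x) * norm(y))   # inequality (26)
print(norm(total), norm(x) + norm(y))        # the triangle inequality
print(abs(inner(x, parallel)), norm(x) * norm(parallel))
```

In the last line the two numbers coincide up to rounding: for linearly dependent vectors inequality (26) turns into an equality.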

A Euclidean space over the complex number field (see Sec. 17) or, 
as we briefly call it, a complex Euclidean space is also considered in 
mathematics. In this case a scalar product of two vectors can be 
a complex number and axiom 14 is therefore replaced by a new 
axiom of the form (x, y) = (y, x)* where the asterisk indicates
the conjugate complex number. In particular, this implies that
(x, λy) = (λy, x)* = [λ (y, x)]* = λ* (x, y). We can easily verify
that all other assertions of this section remain true for complex
spaces. The n-dimensional complex Cartesian space Z_n with the
vectors of the form x (x_1, x_2, ..., x_n) where x_1, ..., x_n are
arbitrary complex numbers is an important example of a complex
Euclidean space if we introduce a scalar product by means of the
formula (x, y) = x_1y_1* + x_2y_2* + ... + x_ny_n* which substitutes for
formula (24). We leave to the reader the verification of all the axioms
and, in particular, axiom 13 for this case.
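
The properties of the complex scalar product are conveniently tried out with the complex numbers built into Python. The sketch below is an illustration only; the name `inner`, the sample vectors of Z_2 and the number λ are our assumptions.

```python
def inner(x, y):
    """Scalar product in Z_n: x_1 y_1* + x_2 y_2* + ... + x_n y_n*."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


x = (1 + 2j, 3j)
y = (2 - 1j, 1 + 1j)
lam = 1j
lam_y = tuple(lam * b for b in y)   # the vector lam * y

print(inner(x, y), inner(y, x))     # conjugate complex numbers
print(inner(x, x))                  # a positive real number (axiom 13)
print(inner(x, lam_y), lam.conjugate() * inner(x, y))
```

The last line shows that a numerical factor of the second vector is taken outside the scalar product as its conjugate.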

21. Orthogonality. The notion of an angle between any two vectors
is introduced in a natural way in a Euclidean space (R) by the
formula

$$\cos(\widehat{\mathbf{x},\mathbf{y}}) = \frac{(\mathbf{x},\mathbf{y})}{|\mathbf{x}|\,|\mathbf{y}|} \tag{27}$$

On the basis of Sec. 7 we see that the definition yields a usual angle
in the case of ordinary geometrical vectors. It is necessary to note
that, by inequality (26), the right-hand side of equality (27) does
not exceed unity in its absolute value and therefore definition (27)
always makes sense. In case vectors x and y are linearly dependent,
i.e. y = λx, formula (27) implies that the angle between x and y is
equal to 0° if λ > 0 and to 180° if λ < 0. Conversely, if the cosine
of the angle between x and y is equal to ±1 then inequality (27)
shows that the vectors x and y are linearly dependent, but we leave
the proof to the reader.
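
Definition (27) makes it possible to compute angles in E_n exactly as for ordinary vectors. The sketch below is an illustration only: the names `inner`, `norm` and `angle_deg` and the sample vectors of E_4 are our assumptions, and the clamping of the cosine to the interval from -1 to 1 is a precaution against rounding errors which is not part of the definition.

```python
import math


def inner(x, y):
    """Scalar product of two vectors of E_n by formula (24)."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm | x | equal to the square root of (x, x)."""
    return math.sqrt(inner(x, x))


def angle_deg(x, y):
    """Angle between nonzero vectors x and y by formula (27), in degrees."""
    cosine = inner(x, y) / (norm(x) * norm(y))
    cosine = max(-1.0, min(1.0, cosine))   # guard against rounding errors
    return math.degrees(math.acos(cosine))


x = (1.0, 0.0, 0.0, 0.0)
angles = [angle_deg(x, (1.0, 1.0, 0.0, 0.0)),    # expected to be about 45
          angle_deg(x, (3.0, 0.0, 0.0, 0.0)),    # y = 3x
          angle_deg(x, (-2.0, 0.0, 0.0, 0.0)),   # y = -2x
          angle_deg(x, (0.0, 1.0, 0.0, 0.0))]    # orthogonal vectors
print(angles)
```

The second and the third angles correspond to linearly dependent vectors with a positive and a negative factor respectively.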

In the case 

(x, y) = 0 (28) 


we have an important situation when the vectors x and y form an 
angle of 90° or, as it is often said, the vectors are orthogonal. If at 
least one of the vectors x or y is equal to 0 relation (28) necessarily 
holds and therefore the zero vector is regarded as orthogonal to any 
given vector although we can attribute any value to the angle which 
the zero vector forms with a given vector. 

A system of vectors a, b, . . ., d is called orthogonal if all the 
vectors are mutually pairwise orthogonal and none of them is equal 
to the zero vector. Such a system is always linearly independent. 
In fact, if we take the scalar product of an equality of the form
$\mathbf{d} = \alpha\mathbf{a} + \beta\mathbf{b} + \ldots$ by a we obtain $0 = \alpha(\mathbf{a}, \mathbf{a})$ which implies
$\alpha = 0$. Similarly, we conclude that all other coefficients on the
right-hand side of the last linear combination are equal to zero
which is impossible since d would then be the zero vector.

It is most convenient to use an orthogonal basis in a finite-dimensional
space, that is a basis which is simultaneously an orthogonal
system. In fact, if $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ is such a basis then any vector x
is represented in the form $\mathbf{x} = \alpha_1\mathbf{p}_1 + \alpha_2\mathbf{p}_2 + \cdots + \alpha_n\mathbf{p}_n$ and
to find the coefficients of the representation it is sufficient to multiply
(in the sense of a scalar product) both sides by the vectors
$\mathbf{p}_k$ (k = 1, 2, . . ., n). By the orthogonality, this yields
$(\mathbf{x}, \mathbf{p}_k) = \alpha_k(\mathbf{p}_k, \mathbf{p}_k)$ which implies

$$ \alpha_k = \frac{(\mathbf{x}, \mathbf{p}_k)}{(\mathbf{p}_k, \mathbf{p}_k)} \quad (k = 1, 2, \ldots, n) \tag{29} $$

If we had a non-orthogonal basis the above method would give
a system of n equations of the first degree in n unknowns $\alpha_k$. Formula
(29) becomes especially simple when the vectors which form
an orthogonal basis are normalized (see Sec. 20). Indeed, then the
right-hand sides of (29) have denominators equal to unity. An orthogonal
basis of normalized vectors is called a Euclidean basis (compare
with Sec. 9) or an orthonormal basis.

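An orthogonal basis can be constructed in any finite-dimensional
Euclidean space. Actually, let, for example, (R) be four-dimensional
and let $\mathbf{q}_1$, $\mathbf{q}_2$, $\mathbf{q}_3$ and $\mathbf{q}_4$ be a basis in (R). We put $\mathbf{p}_1 = \mathbf{q}_1$ and
$\mathbf{p}_2 = \mathbf{q}_2 + \alpha\mathbf{p}_1$ and try to choose $\alpha$ so that $(\mathbf{p}_1, \mathbf{p}_2)$ should be equal
to zero: $(\mathbf{p}_2, \mathbf{p}_1) = 0$. This yields $(\mathbf{q}_2 + \alpha\mathbf{p}_1, \mathbf{p}_1) = 0$, i.e.
$\alpha = -(\mathbf{q}_2, \mathbf{p}_1)(\mathbf{p}_1, \mathbf{p}_1)^{-1}$. Then we put $\mathbf{p}_3 = \mathbf{q}_3 + \beta_1\mathbf{p}_1 + \beta_2\mathbf{p}_2$ and
choose $\beta_1$ and $\beta_2$ in such a way that $(\mathbf{p}_3, \mathbf{p}_1) = 0$ and $(\mathbf{p}_3, \mathbf{p}_2) = 0$.
This implies (check it!) $\beta_1 = -(\mathbf{q}_3, \mathbf{p}_1)(\mathbf{p}_1, \mathbf{p}_1)^{-1}$ and
$\beta_2 = -(\mathbf{q}_3, \mathbf{p}_2)(\mathbf{p}_2, \mathbf{p}_2)^{-1}$. Finally, putting
$\mathbf{p}_4 = \mathbf{q}_4 + \gamma_1\mathbf{p}_1 + \gamma_2\mathbf{p}_2 + \gamma_3\mathbf{p}_3$ we determine the coefficients $\gamma_k$ (k = 1, 2, 3) in such a way
that $(\mathbf{p}_4, \mathbf{p}_1) = 0$, $(\mathbf{p}_4, \mathbf{p}_2) = 0$ and $(\mathbf{p}_4, \mathbf{p}_3) = 0$. We leave the
determination of $\gamma_k$ (k = 1, 2, 3) to the reader. It is easy to show
that by the linear independence of the vectors $\mathbf{q}_k$ (k = 1, 2, 3, 4)
all the denominators entering in this procedure are different from
zero and this orthogonalization process can therefore be completed.
The vectors $\mathbf{p}_1$, $\mathbf{p}_2$, $\mathbf{p}_3$ and $\mathbf{p}_4$ thus obtained form an orthogonal
basis in (R).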
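
The orthogonalization process and formula (29) can be sketched in Python for the four-dimensional space of coordinate vectors with the ordinary scalar product. This is a minimal illustration under our own assumptions: the function names `orthogonalize` and `coefficients`, the sample basis and the sample vector are hypothetical, and exact fractions are used only so that the checks are free of rounding errors.

```python
from fractions import Fraction

def dot(x, y):
    """Ordinary scalar product of two coordinate vectors."""
    return sum(a * b for a, b in zip(x, y))

def orthogonalize(qs):
    """Turn a basis q_1, ..., q_n into an orthogonal basis p_1, ..., p_n."""
    ps = []
    for q in qs:
        p = list(q)
        for prev in ps:
            # coefficient of the form -(q, p_k) (p_k, p_k)^(-1), as in the text
            coeff = -dot(q, prev) / dot(prev, prev)
            p = [pi + coeff * ri for pi, ri in zip(p, prev)]
        ps.append(p)
    return ps

def coefficients(x, ps):
    """Coefficients alpha_k of x with respect to the orthogonal basis ps, by formula (29)."""
    return [dot(x, p) / dot(p, p) for p in ps]

F = Fraction
qs = [[F(1), F(1), F(0), F(0)],      # an assumed, linearly independent basis
      [F(1), F(0), F(1), F(0)],
      [F(0), F(1), F(1), F(1)],
      [F(1), F(0), F(0), F(1)]]
ps = orthogonalize(qs)

x = [F(3), F(-1), F(2), F(5)]
alphas = coefficients(x, ps)
rebuilt = [sum(a * p[i] for a, p in zip(alphas, ps)) for i in range(4)]
print(ps)
print(rebuilt == x)
```

No system of equations has to be solved here: each coefficient is obtained from a single quotient of scalar products.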

It follows that all the Euclidean spaces of the same dimension 
are isomorphic to each other. Let us discuss here this important 
corollary. Two given Euclidean spaces are called isomorphic if it 
is possible to establish a linear isomorphism (see Sec. 19) between 
them which preserves the scalar product, that is $(\mathbf{x}, \mathbf{y})_R = (\mathbf{x}', \mathbf{y}')_{R'}$
(the subscripts R and R' indicate here the spaces in which the scalar 
products are taken). Two isomorphic Euclidean spaces are indistin- 
guishable from the point of view of the theory of Euclidean spaces. 
It is obvious that two isomorphic Euclidean spaces are of the same 
dimension. But it is easy to show that, conversely, two Euclidean 
spaces of the same dimension are isomorphic. To prove this we 
choose an orthonormal basis in each of the spaces (by the way, if 
we normalize orthogonal vectors their orthogonality is preserved). 
Let these bases be $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ and $\mathbf{p}'_1, \mathbf{p}'_2, \ldots, \mathbf{p}'_n$, respectively.
Now we establish an isomorphism by using the procedure
described in Sec. 19:

$$ \alpha_1\mathbf{p}_1 + \alpha_2\mathbf{p}_2 + \ldots + \alpha_n\mathbf{p}_n \leftrightarrow \alpha_1\mathbf{p}'_1 + \alpha_2\mathbf{p}'_2 + \ldots + \alpha_n\mathbf{p}'_n $$

Actually, if $\mathbf{x} = \alpha_1\mathbf{p}_1 + \ldots + \alpha_n\mathbf{p}_n$ and $\mathbf{y} = \beta_1\mathbf{p}_1 + \ldots + \beta_n\mathbf{p}_n$
then

$$ (\mathbf{x}, \mathbf{y})_R = \alpha_1\beta_1 + \ldots + \alpha_n\beta_n = (\mathbf{x}', \mathbf{y}')_{R'} $$

(why is it so?). In particular, we conclude that every finite-dimensional
Euclidean space of dimension n is isomorphic to the vector
space $E_n$. The same is also true for complex spaces.

§ 7. Vector Functions of Scalar Argument. Curvature

22. Vector Variables. Let us return to usual geometrical vectors.
Vector variables are closely related to scalar variables. Indeed,
if, for example, we introduce a Cartesian coordinate system and
write

$$ \mathbf{u} = u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k} \tag{30} $$

then it is seen that every change of the vector quantity u reduces
to certain changes of the scalar quantities $u_x$, $u_y$ and $u_z$. We shall
give only a few simple remarks.

A vector quantity is infinitesimal (we denote this as $\mathbf{u} \to \mathbf{0}$)
if its absolute value | u | is an infinitesimal variable, i.e. $|\mathbf{u}| \to 0$.
But the direction of the vector u may change in an arbitrary
way in such a process and may not have a limit.

The vectorial limiting relationship $\mathbf{u} \to \mathbf{a}$ (where a is a constant
vector) is equivalent to the three scalar relationships $u_x \to a_x$,
$u_y \to a_y$ and $u_z \to a_z$. In addition, if $\mathbf{a} \ne \mathbf{0}$ then u tends to a not
only in its absolute value but also in its direction. The theorems
on the limits of a sum, of a product and the like (see Sec. III.5)
remain true here and their proofs hold too. But, of course, the properties
of limits which are connected with inequalities do not apply
to vectors since, as it was agreed before, we do not consider the
notion of an inequality for vectors in our course.

23. Vector Functions of Scalar Argument. We say that there is
a vector function of a scalar argument $\mathbf{u} = \mathbf{f}(t)$ if to each value
of the scalar variable t there corresponds, by a certain law, a value
of the vector quantity u.

According to the beginning of Sec. 22, the determination of one
vector function is equivalent to the determination of three scalar
functions (understood in the ordinary sense) because $u_x = f_x(t)$,
$u_y = f_y(t)$ and $u_z = f_z(t)$.

The concept of continuity of a vector function of a scalar argument
is introduced in the same way as for scalar functions. Further,

$$ \mathbf{u}' = \mathbf{f}'(t) = \lim_{\Delta t \to 0}\frac{\Delta\mathbf{u}}{\Delta t} = \lim_{\Delta t \to 0}\frac{\mathbf{f}(t + \Delta t) - \mathbf{f}(t)}{\Delta t} $$

All the basic properties of a derivative (see Sec. IV.4) are transferred
(together with their proofs) to the case of vector functions. But it
is necessary to stress that while applying these rules to differentiating
a vector product we must pay attention to the order of factors:
$(\mathbf{u} \times \mathbf{v})' = \mathbf{u}' \times \mathbf{v} + \mathbf{u} \times \mathbf{v}'$. The concept of Taylor’s series is
also extended to vector functions (see Eq. IV.52).

The following property is sometimes of use: if $|\mathbf{u}(t)| = \text{const}$
then $\mathbf{u}' \perp \mathbf{u}$. Indeed, the condition can be written in the form
$\mathbf{u} \cdot \mathbf{u} = u^2 = \text{const}$. Now differentiating we obtain $\mathbf{u}' \cdot \mathbf{u} + \mathbf{u} \cdot \mathbf{u}' = 0$
and $2\mathbf{u}' \cdot \mathbf{u} = 0$ which implies $\mathbf{u}' \perp \mathbf{u}$.

To illustrate the geometrical meaning of a vector function of
a scalar argument let us draw the vector u from a fixed point O
in space. Then it is natural to regard u as a radius-vector (see Sec. 9)
and denote it by r, i.e.

$$ \mathbf{r} = \mathbf{f}(t) \tag{31} $$

As t varies the terminus of the vector r describes a curve (L) in
space (see Fig. 172). We can therefore regard (31) as a vector-parametric
equation of the curve (L). But if we introduce Cartesian axes
it is easy to pass from (31) to scalar-parametric equations of the form

$$ x = \varphi(t), \quad y = \psi(t), \quad z = \chi(t) \tag{32} $$

which determine the same curve. The right-hand sides of (32) are 
the result of projecting the function f ( t ) on the coordinate axes. 
[Compare this with parametric equations (II. 10) of a plane curve.] 
If a space curve is originally defined by equations of form (32) then 
we pass to form (31) according to the formula $\mathbf{r} = \varphi(t)\,\mathbf{i} + \psi(t)\,\mathbf{j} + \chi(t)\,\mathbf{k}$.
It is convenient to interpret the argument t as time. Then
the curve ( L ) may be regarded as a trajectory of a moving point. 
This curve is also called the hodograph of the vector function u. 

For example, as it is shown in Fig. 173, the equation $\mathbf{r} = \mathbf{a} + \mathbf{b}t$
where a and b are some constant vectors determines a straight line
and describes a uniform rectilinear motion along this line with the
velocity b. Projecting this equation on the coordinate axes we arrive
at the parametric equations of the straight line:

$$ x = a_x + b_x t, \quad y = a_y + b_y t, \quad z = a_z + b_z t \tag{33} $$

As another example, let us take the equation of a screw line (circular
helix). This curve can be obtained as a superposition of two
motions of a point: the uniform motion along an axis and the uniform
rotation about the same axis (see Fig. 174). We choose the z-axis
as the axis of revolution and denote the speed of the rectilinear
motion by v and the angular speed by $\omega$. Then we have

$$ x = R\cos\omega t, \quad y = R\sin\omega t, \quad z = vt $$

or, in the form of a vector equation,

$$ \mathbf{r} = R\,(\cos\omega t\,\mathbf{i} + \sin\omega t\,\mathbf{j}) + vt\,\mathbf{k} $$


If the variable t receives an increment $\Delta t$ the point M on the
curve (L) passes to the position N (see Fig. 172). Hence, $\Delta\mathbf{r} = \overrightarrow{MN}$.
The ratio $\frac{\Delta\mathbf{r}}{\Delta t}$ (which is a vector representing the average velocity)
also lies on the straight line MN because $\Delta t$ is a scalar. When
$\Delta t \to 0$ we have $\frac{\Delta\mathbf{r}}{\Delta t} \to \mathbf{r}' = \mathbf{v}$ (where v is the instantaneous
velocity). Let us prolong the secant MN and watch the change
of the position of this straight line as $\Delta t \to 0$, i.e. as $N \to M$.
The secant rotates about the point M in this process and tends
to the position of the tangent to the trajectory at the point M.
Hence we see that the instantaneous velocity vector $\mathbf{v} = \frac{d\mathbf{r}}{dt}$ is directed
along the tangent line to the trajectory, this fact being well known
in mechanics. The vector-parametric equation of a tangent line drawn
for a certain value $t = t_0$ has the form (see Fig. 173)

$$ \mathbf{r} = \mathbf{r}_0 + \mathbf{v}_0\,(t - t_0) $$

where $\mathbf{r}_0 = \mathbf{f}(t_0)$ and $\mathbf{v}_0 = \mathbf{f}'(t_0)$. We can rewrite this equation as
follows: $\Delta\mathbf{r} = d\mathbf{f}$. But if we take the equation $\mathbf{r} = \mathbf{f}(t)$ of the curve
(L) then $\Delta\mathbf{r} = \Delta\mathbf{f}$. Thus we see that the replacement of $\Delta\mathbf{f}$ by $d\mathbf{f}$
is equivalent to the replacement of the motion along the curve (L)
by the uniform motion along the tangent line with the velocity
equal to the instantaneous velocity at the given moment of time,
that is by a motion which would appear if all the forces stopped
acting on the moving point at this moment.

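According to Sec. III.9 (see example 1) we have, as $\Delta s \to 0$,

$$ \frac{|\Delta s|}{|\Delta\mathbf{r}|} \to 1, \quad \left|\frac{\Delta\mathbf{r}}{\Delta s}\right| \to 1, \quad \left|\frac{d\mathbf{r}}{ds}\right| = \lim\left|\frac{\Delta\mathbf{r}}{\Delta s}\right| = 1, \quad |d\mathbf{r}| = |ds| \tag{34} $$

Here we take the absolute values of $\Delta s$ and ds since these quantities
can be both positive and negative. Let us choose now the arc length
s as a parameter varying along the curve (L) and reckon it from
a point $M_0$ of the curve, that is take the equation of the curve in
the form $\mathbf{r} = \mathbf{r}(s)$. Then we see that the derivative $\frac{d\mathbf{r}}{ds}$ is a unit
vector in the direction of the tangent to (L), that is the unit tangent
vector (see Sec. 7). This vector is usually denoted by the letter
$\boldsymbol{\tau}$, i.e. $\frac{d\mathbf{r}}{ds} = \boldsymbol{\tau}$. Consequently, we have

$$ \frac{d\mathbf{r}}{dt} = \mathbf{v} = \frac{d\mathbf{r}}{ds}\,\frac{ds}{dt} = v\boldsymbol{\tau}, \qquad \boldsymbol{\tau} = \frac{d\mathbf{r}}{ds} = \frac{d\mathbf{r}/dt}{ds/dt} \tag{35} $$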


We remark in conclusion that formula (34) implies the following 
expression for the differential of the arc length in Cartesian coor- 
dinates: 

$$ ds = \pm|d\mathbf{r}| = \pm|d(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})| = \pm|dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}| = \pm\sqrt{dx^2 + dy^2 + dz^2} $$
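
The expression for the differential of the arc length can be tried out on the circular helix of Sec. 23. In the hedged sketch below the arc length of one turn is approximated by a sum of small chords and compared with the product of the constant speed and the time of one turn. The names `r`, `arc_length`, `omega` (the angular speed) and all numerical values are our own illustrative assumptions.

```python
import math

R, omega, v = 2.0, 3.0, 1.5   # assumed radius, angular speed and axial speed

def r(t):
    """Point of the helix x = R cos(omega t), y = R sin(omega t), z = v t."""
    return (R * math.cos(omega * t), R * math.sin(omega * t), v * t)

def arc_length(curve, t0, t1, n=20000):
    """Approximate the arc length by the sum of chord lengths |delta r| over n steps."""
    total = 0.0
    prev = curve(t0)
    for k in range(1, n + 1):
        cur = curve(t0 + (t1 - t0) * k / n)
        total += math.dist(prev, cur)
        prev = cur
    return total

T = 2 * math.pi / omega                              # time of one full turn
numeric = arc_length(r, 0.0, T)
exact = math.sqrt((R * omega) ** 2 + v ** 2) * T     # constant speed times time
print(numeric, exact)
```

The two printed values should agree to many digits, since the chord sum tends to the integral of $\sqrt{dx^2 + dy^2 + dz^2}$ as the step decreases.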


24. Some Notions Related to the Second Derivative. Since
$|\boldsymbol{\tau}(s)| = 1 = \text{const}$ we have $\frac{d\boldsymbol{\tau}}{ds} \perp \boldsymbol{\tau}$ (see the beginning of Sec. 23).
The straight line pp (see Fig. 175) drawn through a moving point
M of the trajectory (L) and parallel to $\frac{d\boldsymbol{\tau}}{ds}$ is therefore a normal to
(L) (a perpendicular to the tangent ll at the point M). There exists
an infinitude of normals drawn to a curve in space at each of its
points. At each point these normals form a plane which is called
the normal plane. To distinguish the normal in the direction of the
vector $\frac{d\boldsymbol{\tau}}{ds}$ from other normals at a given point M we call this normal
the principal normal of the curve (L) at the point M. The length
(the absolute value) of the vector $\frac{d\boldsymbol{\tau}}{ds}$ is called the curvature of the
curve (L) at the point M: $\left|\frac{d\boldsymbol{\tau}}{ds}\right| = k$ and $\frac{d\boldsymbol{\tau}}{ds} = k\mathbf{n}$ where n is the
unit vector in the direction of the principal normal.

Fig. 176

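The geometrical significance of the curvature is shown in Fig. 176:

$$ \left|\frac{d\boldsymbol{\tau}}{ds}\right| = \lim\left|\frac{\Delta\boldsymbol{\tau}}{\Delta s}\right| = \lim\frac{|\Delta\boldsymbol{\tau}|}{|\Delta s|} = \lim\frac{2\sin\frac{\Delta\varphi}{2}}{|\Delta s|} = \lim\frac{\Delta\varphi}{|\Delta s|} $$

where $\Delta\varphi$ is the angle through which the tangent turns
(in the last passage to the limit we have used property 4 from
Sec. III.8).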

Thus, the curvature is the speed (the angular speed related to 
the unit of the distance passed over) with which the tangent to the 
curve rotates. Incidentally, we see that the vectors $\boldsymbol{\tau}'_s$ and n go in
the direction of the concavity of the curve. 

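Differentiating the first formula (35) with respect to t we obtain

$$ \frac{d^2\mathbf{r}}{dt^2} = \frac{dv}{dt}\,\boldsymbol{\tau} + v\,\frac{d\boldsymbol{\tau}}{dt} = \frac{dv}{dt}\,\boldsymbol{\tau} + v\,\frac{d\boldsymbol{\tau}}{ds}\,\frac{ds}{dt} = \frac{dv}{dt}\,\boldsymbol{\tau} + v^2 k\mathbf{n} $$

This formula is widely applied in mechanics since if t is the time
of motion the formula shows that the acceleration vector $\frac{d^2\mathbf{r}}{dt^2}$ can be
resolved into the components $\frac{dv}{dt}\,\boldsymbol{\tau}$ and $v^2 k\mathbf{n}$. $\frac{dv}{dt}\,\boldsymbol{\tau}$ is the tangential
component since it has the direction of the tangent and $v^2 k\mathbf{n}$ is the
normal component (directed along the principal normal).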

Thus, the vector $\frac{d^2\mathbf{r}}{dt^2}$ drawn from a point M must necessarily lie

in the plane passing through the tangent and the principal 
normal drawn from this point. This plane is called the osculating 
plane of the curve (L) at the point M. Applying Taylor’s formula 
(IV.50) to $\mathbf{f}(t_0 + \Delta t)$ we conclude that the curve (L) may be regarded
in the vicinity of its point as lying in its osculating plane with
an accuracy up to infinitesimals of the second order (relative to $\Delta t$)
inclusive. (It can similarly be shown that by formula (IV.49) the
curve (L) coincides with its tangent to within infinitesimals of the
second order relative to $\Delta t$ as $\Delta t \to 0$.)

Hence, the osculating plane of the curve ( L ) at the point M may 
be regarded as a plane passing through three points of the curve 
( L ) lying infinitely close to the point M (just as we regard the tan- 
gent line as a line passing through two points of a curve which are 
infinitely close to each other). 

25. Osculating Circle. Let a curve (L) in a plane be represented
parametrically (see Sec. II.6): x = x(t) and y = y(t). Then, as
it is seen in Fig. 177, the curvature k of the curve at any moving
point is equal to $\left|\frac{d\varphi}{ds}\right| = \left|\frac{d\arctan(\dot{y}/\dot{x})}{ds}\right|$ where the dot denotes the
differentiation with respect to the parameter. Taking the expression
of ds obtained in the end of Sec. 23 and performing some transformations
we obtain

$$ k = \frac{\left|d\arctan\frac{\dot{y}}{\dot{x}}\right|}{\sqrt{dx^2 + dy^2}} = \frac{1}{\dot{x}^2 + \dot{y}^2}\,|\ddot{y}\dot{x} - \dot{y}\ddot{x}|\,dt : \sqrt{\dot{x}^2 + \dot{y}^2}\,dt = \frac{|\ddot{y}\dot{x} - \dot{y}\ddot{x}|}{(\dot{x}^2 + \dot{y}^2)^{3/2}} \tag{36} $$

Fig. 177: $\tan\varphi = \frac{dy}{dx} = \frac{\dot{y}}{\dot{x}}$ (see Sec. IV.9). Fig. 178: $\Delta s = R\,\Delta\varphi$.

In particular, if the equation of a curve is represented in the
form y = f(x), that is the argument x itself is regarded as a parameter,
then $\dot{y} = y'$, $\ddot{y} = y''$, $\dot{x} = 1$, $\ddot{x} = 0$ and

$$ k = \frac{|y''|}{(1 + y'^2)^{3/2}} \tag{37} $$

Fig. 178 shows that for a circle

$$ k = \lim\left|\frac{\Delta\varphi}{\Delta s}\right| = \lim\frac{\Delta\varphi}{R\,\Delta\varphi} = \frac{1}{R} \tag{38} $$
which means that the curvature of a circle is constant and inverse
to its radius. The only plane curves with constant curvature
are circles and straight lines; the curvature of a straight line is
equal to zero.

Let us take an arbitrary point M on a curve (L). The circle (K)
passing through M and having the same tangent, the same curvature
and the same direction of convexity as (L) is called the osculating
circle (the “circle of curvature”) of the curve (L) at the point M
(see Fig. 177). The radius and the centre of this circle are called the
radius of curvature and the centre of curvature, respectively. According
to formulas (36)-(38) we have

$$ R_{\text{osculating circle}} = \frac{1}{k_{\text{curve }(L)}} = \left|\frac{ds}{d\varphi}\right| = \frac{(\dot{x}^2 + \dot{y}^2)^{3/2}}{|\ddot{y}\dot{x} - \dot{y}\ddot{x}|} = \frac{(1 + y'^2)^{3/2}}{|y''|} \tag{39} $$
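
Formulas (36)-(39) lend themselves to a direct numerical check. The sketch below is an illustration under our own assumptions (the function name `curvature`, the sample parameter values and the sample curves are hypothetical): it evaluates the curvature of a circle, of a straight line and of the parabola $y = x^2$ at its lowest point.

```python
import math

def curvature(dx, dy, ddx, ddy):
    """Formula (36); dx, dy, ddx, ddy are the first and second derivatives in the parameter."""
    return abs(ddy * dx - dy * ddx) / (dx * dx + dy * dy) ** 1.5

R = 4.0
circle_ks = []
for t in (0.0, 0.7, 2.5):
    # circle x = R cos t, y = R sin t
    circle_ks.append(curvature(-R * math.sin(t), R * math.cos(t),
                               -R * math.cos(t), -R * math.sin(t)))

k_line = curvature(2.0, -1.0, 0.0, 0.0)      # straight line x = 1 + 2t, y = 3 - t
k_parabola = curvature(1.0, 0.0, 0.0, 2.0)   # y = x^2 at x = 0, with x as the parameter
radius_parabola = 1.0 / k_parabola           # radius of curvature, formula (39)
print(circle_ks, 1 / R, k_line, k_parabola, radius_parabola)
```

For the circle every computed value should coincide with 1/R up to rounding, in agreement with (38), and for the straight line the curvature is zero.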

A curve (L) is very close to its osculating circle near its point M.
On the basis of Taylor’s formula (IV.50) we can show that the
curve (L) can be regarded as coinciding with its osculating circle
in the vicinity of the point M with an infinitesimal error of the
third order relative to $\Delta t$. This assertion, in its turn, implies that
the osculating circle may be regarded as a circle passing through
three points of the curve (L) lying infinitely close to each other.
The last property also applies to curves in space. In particular, it
straightway implies that the osculating circle lies in the osculating
plane.

Fig. 179

The points of a curve at which the curvature assumes its extremal
values (but not the points of inflection) are called the vertices of
the curve. For instance, take the parametric equations x = a cos t
and y = b sin t of an ellipse. Then

$$ k = \frac{|(-b\sin t)(-a\sin t) - (b\cos t)(-a\cos t)|}{[(-a\sin t)^2 + (b\cos t)^2]^{3/2}} = \frac{ab}{(a^2\sin^2 t + b^2\cos^2 t)^{3/2}} $$

Here the denominator has extremal values at t = 0, $t = \frac{\pi}{2}$, $t = \pi$,
$t = \frac{3\pi}{2}$ etc. (check it up!). Therefore the vertices of an ellipse
understood in the new sense coincide with the vertices defined in
Sec. II.10. The radius of curvature is equal to $\frac{b^2}{a}$ at the points of
intersection of the ellipse with the x-axis and is equal to $\frac{a^2}{b}$ at the
points of intersection with the y-axis. This result is applied to constructing
an approximate form of an ellipse: we draw the circles
of curvature at the vertices by means of a compass (see Fig. 179)
and then connect the circles by the lines drawn with the help of
a French curve. For the sake of simplicity only a quarter of the
ellipse is shown. The construction of the radii of curvature at the
vertices is also demonstrated.
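
The radii of curvature at the vertices of an ellipse can be checked numerically. The following sketch assumes the semi-axes a = 5 and b = 3 purely for illustration; the function name `radius_of_curvature` is our own.

```python
import math

a, b = 5.0, 3.0   # assumed semi-axes, a > b

def radius_of_curvature(t):
    """R = 1/k for the ellipse x = a cos t, y = b sin t."""
    return (a * a * math.sin(t) ** 2 + b * b * math.cos(t) ** 2) ** 1.5 / (a * b)

on_x_axis = radius_of_curvature(0.0)            # vertex on the x-axis
on_y_axis = radius_of_curvature(math.pi / 2)    # vertex on the y-axis
print(on_x_axis, b * b / a)
print(on_y_axis, a * a / b)

# sampling the whole ellipse: the extremal radii occur at the vertices
samples = [radius_of_curvature(2 * math.pi * i / 360) for i in range(360)]
print(min(samples), max(samples))
```

With these values the smallest radius is found on the x-axis and the largest on the y-axis, which is why the two compass circles in the approximate construction have different sizes.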

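To determine the coordinates $\xi$ and $\eta$ of the centre of curvature
of a curve (L) at a point M let us suppose that the curve is convex
downwards at M, that is $y'' > 0$ (see Sec. IV.20), as it is shown in
Fig. 177. Then

$$ \begin{aligned} \xi &= x - R\sin\varphi = x - \frac{(1 + y'^2)^{3/2}}{y''}\cdot\frac{\tan\varphi}{\sqrt{1 + \tan^2\varphi}} = x - \frac{(1 + y'^2)^{3/2}}{y''}\cdot\frac{y'}{\sqrt{1 + y'^2}} = x - \frac{y'(1 + y'^2)}{y''}, \\ \eta &= y + R\cos\varphi = y + \frac{(1 + y'^2)^{3/2}}{y''}\cdot\frac{1}{\sqrt{1 + \tan^2\varphi}} = y + \frac{(1 + y'^2)^{3/2}}{y''}\cdot\frac{1}{\sqrt{1 + y'^2}} = y + \frac{1 + y'^2}{y''} \end{aligned} \tag{40} $$

In case $y'' < 0$ we obtain the same formulas. If we pass to the
derivatives with respect to an arbitrary parameter we obtain
(check it!)

$$ \xi = x - \frac{\dot{y}\,(\dot{x}^2 + \dot{y}^2)}{\ddot{y}\dot{x} - \dot{y}\ddot{x}}, \qquad \eta = y + \frac{\dot{x}\,(\dot{x}^2 + \dot{y}^2)}{\ddot{y}\dot{x} - \dot{y}\ddot{x}} \tag{41} $$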
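
As a small worked illustration of formulas (40), the sketch below finds the centre of curvature of the parabola $y = x^2$ at several points and confirms that its distance from the point of the curve equals the radius of curvature given by (39). The function names `centre_of_curvature` and `radius` and the sample points are our own assumptions.

```python
import math

def centre_of_curvature(x, y, d1, d2):
    """Formulas (40): centre (xi, eta) from the point (x, y), d1 = y' and d2 = y'' != 0."""
    xi = x - d1 * (1 + d1 * d1) / d2
    eta = y + (1 + d1 * d1) / d2
    return xi, eta

def radius(d1, d2):
    """Radius of curvature of a curve y = f(x), formula (39)."""
    return (1 + d1 * d1) ** 1.5 / abs(d2)

# parabola y = x^2: y' = 2x, y'' = 2
gaps = []
for x in (0.0, 1.0, -0.5):
    y, d1, d2 = x * x, 2 * x, 2.0
    xi, eta = centre_of_curvature(x, y, d1, d2)
    distance = math.hypot(xi - x, eta - y)
    gaps.append(abs(distance - radius(d1, d2)))
    print((xi, eta), distance, radius(d1, d2))
```

At the lowest point of the parabola the centre lies on the axis of symmetry, half a unit above the curve.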


26. Evolute and Evolvent. A curve (L) given, consider the locus
of centres of curvature of the curve (L) and denote it by $(\bar{L})$. This
locus is called the evolute of the curve. In its turn, the curve (L)
itself is called the evolvent (or involute) with respect to its evolute
$(\bar{L})$.

Thus, if $(\bar{L})$ is the evolute of (L) then (L) is the evolvent of $(\bar{L})$
and vice versa. It can be shown that we have the following relationships
between an evolute and its evolvent.

(1) Let M be an arbitrary point of the evolvent and $\bar{M}$ the corresponding
point of the evolute, i.e. $\bar{M}$ is the centre of curvature of
the curve (L) at the point M. Then the straight line $M\bar{M}$ is not
only the normal to the evolvent but also the tangent to the evolute.
This relationship is illustrated in Fig. 180.

(2) Let a point M move along the evolvent. Then the increment 
of the radius of curvature is equal to the length of the corresponding 
arc of the evolute lying between the centres of curvature (see 
Fig. 181). 

These properties imply the following characteristic feature of
an evolvent: if an unstretchable taut thread is wound off the contour
having the form of the evolute the end of the thread describes the
evolvent. The involute (evolvent) of a circle is of practical importance
because the profiles of the lateral surfaces of teeth of most
gear wheels are shaped in the form of the evolvent of a circle.

$(\bar{L})$ is the evolute, (L) is the evolvent

If a curve (L) is represented in the parametrical form x = x(t)
and y = y(t) then the parametrical equations of its evolute
$\xi = \xi(t)$ and $\eta = \eta(t)$ (where $\xi$ and $\eta$ are the coordinates on the
same x- and y-axis) are obtained from formulas (41) by substituting
the expressions of x and y in terms of t into the formulas.

To prove the two properties let us first differentiate equalities
(40) (which hold if $\frac{d\varphi}{ds} > 0$):

$$ d\xi = dx - dR\,\sin\varphi - R\cos\varphi\,d\varphi, \qquad d\eta = dy + dR\,\cos\varphi - R\sin\varphi\,d\varphi \tag{42} $$

But formula (39) implies

$$ dx = ds\,\cos\varphi = \frac{ds}{d\varphi}\cos\varphi\,d\varphi = R\cos\varphi\,d\varphi \quad \text{and, similarly,} \quad dy = R\sin\varphi\,d\varphi \tag{43} $$

Substituting (43) into (42) we obtain

$$ d\xi = -dR\,\sin\varphi, \qquad d\eta = dR\,\cos\varphi \tag{44} $$

The same final result (44) would be obtained if we considered the
case $\frac{d\varphi}{ds} < 0$.

The immediate consequence of formulas (44) is
that $\frac{d\eta}{d\xi} = -\cot\varphi = -\frac{1}{dy/dx}$. This means (see problem 5 in Sec. II.9) that
the tangent to the evolvent at its arbitrary point M and the tangent
to the evolute at the corresponding point $\bar{M}$ are mutually perpendicular
(see Fig. 180) which is just
the first property.

From formulas (44) we also deduce
$d\xi^2 + d\eta^2 = dR^2$. This implies (see
the end of Sec. 23) $dR = \pm ds$ where s
is the arc length reckoned along the
evolute. Consequently, $d(R \mp s) = 0$,
$R \mp s = \text{const}$, $R = C \pm s$, $\Delta R = \pm\Delta s$
and $|\Delta R| = |\Delta s|$. This furnishes
the proof of the second property of
the evolute and the evolvent.

It may be shown that to the vertices
of a curve (see the end of Sec. 25)
there correspond cusps of its evolute (see Fig. 182). For example,
the evolute of an ellipse has four cusps. If a curve has a zero
curvature at a point (in particular, such a situation usually occurs
at points of inflection) the corresponding point of its evolute travels
into infinity.

As an example, let us determine the evolute of a cycloid [see
formulas (II.12) in which we substitute t for $\psi$]. Here we have

$$ \begin{aligned} x &= R\,(t - \sin t), & \dot{x} &= R\,(1 - \cos t), & \ddot{x} &= R\sin t; \\ y &= R\,(1 - \cos t), & \dot{y} &= R\sin t, & \ddot{y} &= R\cos t; \end{aligned} $$

$$ \dot{x}^2 + \dot{y}^2 = R^2(1 - \cos t)^2 + R^2\sin^2 t = 2R^2(1 - \cos t); $$

$$ \ddot{y}\dot{x} - \dot{y}\ddot{x} = R\cos t \cdot R\,(1 - \cos t) - R\sin t \cdot R\sin t = -R^2(1 - \cos t) $$

By formulas (41) we obtain

$$ \xi = R\,(t - \sin t) - \frac{R\sin t \cdot 2R^2(1 - \cos t)}{-R^2(1 - \cos t)} = R\,(t + \sin t) $$

and, similarly, $\eta = -R\,(1 - \cos t)$. Now denoting $t = \pi + \tau$
we obtain

$$ \begin{aligned} \xi &= R\,[\pi + \tau + \sin(\pi + \tau)] = R\,(\tau - \sin\tau) + \pi R, \\ \eta &= -R\,[1 - \cos(\pi + \tau)] = R\,(1 - \cos\tau) - 2R \end{aligned} $$

Since $\pi R$ and $2R$ are constants we see [comparing with formulas
(II.12)] that the evolute of a cycloid is a cycloid of the same size
but translated $\pi R$ units of length in the positive direction of the
x-axis and $2R$ units of length in the negative direction of the y-axis
with respect to the original cycloid. This fact is sometimes utilized
in engineering.
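
The result for the cycloid can be confirmed numerically. The sketch below, written under our own assumptions (the function names `cycloid` and `evolute_point`, the radius value and the sampling of the parameter are hypothetical), computes centres of curvature from the parametric formulas for $\xi$ and $\eta$ and compares them with points of the translated cycloid.

```python
import math

R = 1.5   # assumed radius of the rolling circle

def cycloid(t):
    """Point of the cycloid x = R (t - sin t), y = R (1 - cos t)."""
    return R * (t - math.sin(t)), R * (1 - math.cos(t))

def evolute_point(t):
    """Centre of curvature of the cycloid at parameter t (parametric formulas for xi, eta)."""
    x, y = cycloid(t)
    dx, dy = R * (1 - math.cos(t)), R * math.sin(t)
    ddx, ddy = R * math.sin(t), R * math.cos(t)
    denom = ddy * dx - dy * ddx
    speed2 = dx * dx + dy * dy
    return x - dy * speed2 / denom, y + dx * speed2 / denom

max_gap = 0.0
for i in range(1, 63):
    t = 0.1 * i                      # stay away from t = 0 and t = 2*pi, where the quotient is 0/0
    xi, eta = evolute_point(t)
    cx, cy = cycloid(t - math.pi)    # original cycloid taken at tau = t - pi
    gap = math.hypot(xi - (cx + math.pi * R), eta - (cy - 2 * R))
    max_gap = max(max_gap, gap)
print(max_gap)
```

The largest discrepancy should be of the order of rounding errors only.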

It also follows from Fig. 181 that an evolvent is a special case 
of a roulette (see Sec. II.6) described by a point of a straight line
when the line rolls upon the evolute. Incidentally, this result shows 
that the arc of any curve can serve as a roulette, that is roulettes 
do not form a special class of curves. 


CHAPTER VIII 


Complex Numbers 
and Functions 


Complex numbers are widely used in modern mathematics and 
its applications. It turns out that it is convenient to obtain many 
relationships between real quantities by using complex numbers 
and functions in intermediate calculations. 

§ 1. Complex Numbers 

1. Complex Plane. The definition of a complex number is well 
known from elementary mathematical courses. A complex number 
is an expression of the form 

$$ z = x + iy \tag{1} $$

where x and y are real numbers and i is the imaginary unit satis- 
fying the equality i 2 = —1; x is called the real part and y the ima- 
ginary part of the complex number z. For the real and imaginary 
parts we shall use the notation x = Re z and y = Im z (the term
“imaginary part” is sometimes applied to the whole product iy 
which is more natural but less convenient). Two complex numbers 
are equal if and only if their real parts are equal and their imaginary 
parts are equal: if $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$ then the equalities

$$ z_1 = z_2 \quad \text{and} \quad \begin{cases} x_1 = x_2 \\ y_1 = y_2 \end{cases} \tag{2} $$

are equivalent. Hence, one “complex equality” is equivalent to two
real equalities. Signs of inequality cannot be applied to complex numbers,
that is inequalities of the form $z_1 > z_2$ do not exist.

Complex numbers may be represented in a plane. To do this we 
take a Cartesian coordinate system x, y which enables us to represent
any number of the form (1) as a point M (x; y). Such a plane
is conventionally called a complex plane but, of course, all the 
points of the plane have real coordinates. For brevity we often say
“the point x + iy” instead of “the point corresponding to the number
x + iy” (see Fig. 183).

Real numbers are a special case of complex numbers: if we put
y = 0 in formula (1) we may regard a complex number $x + i \cdot 0$
as representing the real number x; real numbers are represented as
points lying on the real axis (axis of reals), i.e. on the x-axis. Complex
numbers that are not real are called imaginary. Thus, every
complex number is either real or imaginary. A complex number
without a real part (i.e. with the real part equal to zero) is called
pure imaginary; such numbers are represented by points of the
y-axis which is called the imaginary axis or the axis of imaginaries.
A queer thing in the terminology is that the number z = 0 is a pure
imaginary number but not an imaginary number!

It is often convenient to introduce polar coordinates in a complex
plane (see Sec. II.3 and Fig. 183). The polar coordinates $\rho$ and $\varphi$
of the point M (x; y) which represents the complex number
z = x + iy are denoted as $\rho = |z|$ and $\varphi = \arg z$. $\rho$ is called the
modulus or the absolute value of z and $\varphi$ is called the argument, or
amplitude, or phase, of z.

As is known, $\rho = \sqrt{x^2 + y^2}$, $x = \rho\cos\varphi$ and $y = \rho\sin\varphi$. This
implies, by (1), that

$$ z = \rho\,(\cos\varphi + i\sin\varphi) \tag{3} $$

Hence, each complex number can be written in the so-called trigonometric
form (3). The modulus of a complex number is a certain
uniquely defined non-negative real number whereas the argument
is defined within an integral multiple of $2\pi$. For instance, $|i| = 1$
and $\operatorname{Arg} i = \frac{\pi}{2} + 2k\pi$ ($k = 0, \pm 1, \pm 2, \ldots$). The sign Arg z
denotes the totality of all the possible values of the argument of
a complex number z. Thus, Arg z has infinitely many different
values. There is, however, one and only one value of Arg z, denoted
by arg z, which satisfies the inequality $-180^\circ < \arg z \le 180^\circ$;
arg z is called the principal value of the argument. Besides, any
arbitrary value can be taken as the argument of the number z = 0.

2. Algebraic Operations on Complex Numbers. To carry out the
addition of complex numbers we should sum their real parts and 
their imaginary parts separately. Comparing this rule with the 
rule of addition of vectors [see formula (VII. 10)] we see that the 
addition and subtraction of complex numbers are performed in the 
same way as the operations on vectors (see Fig. 184). In particular, 
it follows that 

$$ |z_1 + z_2| \le |z_1| + |z_2| $$

Similarly, it can be verified that complex numbers are multi- 
plied by real numbers in the same way as vectors. These properties 
make it possible to interpret complex numbers as vectors in the 
complex plane. The connection between the representation of com- 
plex numbers as points of the complex plane and their vector repre- 
sentation is obvious: if the vector is drawn from the origin of the 
coordinate system its terminus is at the corresponding point repre- 
senting the complex number. 

The rule of multiplication of complex numbers is quite different 
from that of vectors. If we use the trigonometric form 

$$ z_1 = \rho_1(\cos\varphi_1 + i\sin\varphi_1), \qquad z_2 = \rho_2(\cos\varphi_2 + i\sin\varphi_2) $$

then the product $z = z_1 \cdot z_2$ can be written as

$$ \begin{aligned} z = \rho\,(\cos\varphi + i\sin\varphi) &= \rho_1(\cos\varphi_1 + i\sin\varphi_1)\,\rho_2(\cos\varphi_2 + i\sin\varphi_2) \\ &= \rho_1\rho_2(\cos\varphi_1\cos\varphi_2 + i\cos\varphi_1\sin\varphi_2 + i\sin\varphi_1\cos\varphi_2 - \sin\varphi_1\sin\varphi_2) \\ &= \rho_1\rho_2[\cos(\varphi_1 + \varphi_2) + i\sin(\varphi_1 + \varphi_2)] \end{aligned} $$

Therefore,

$$ \rho = \rho_1\rho_2, \qquad \varphi = \varphi_1 + \varphi_2 $$

i.e.

$$ |z_1 z_2| = |z_1| \cdot |z_2|, \qquad \operatorname{Arg}(z_1 \cdot z_2) = \operatorname{Arg} z_1 + \operatorname{Arg} z_2 $$

Hence, when complex numbers are multiplied their moduli are multiplied
and their arguments are added. This implies that for the inverse
operation, the division, we have

$$ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}, \qquad \operatorname{Arg}\frac{z_1}{z_2} = \operatorname{Arg} z_1 - \operatorname{Arg} z_2 $$

The multiplication of a complex number by i is of special interest:
$|iz| = |z|$ and $\operatorname{Arg} iz = \operatorname{Arg} z + \frac{\pi}{2}$ since $|i| = 1$ and
$\arg i = \frac{\pi}{2}$; thus, the vector iz is obtained by turning the vector z
in the positive direction through a right angle.
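
The rules for moduli and arguments can be observed directly with Python's built-in complex numbers. In the sketch below the moduli and arguments of the two factors are arbitrary illustrative values of our own choosing; `cmath.phase` returns the principal value of the argument in radians.

```python
import cmath
import math

z1 = cmath.rect(2.0, math.radians(50))   # modulus 2, argument 50 degrees (assumed values)
z2 = cmath.rect(1.5, math.radians(80))   # modulus 1.5, argument 80 degrees (assumed values)

product = z1 * z2
quotient = z1 / z2
product_arg = math.degrees(cmath.phase(product))     # arguments are added
quotient_arg = math.degrees(cmath.phase(quotient))   # arguments are subtracted
print(abs(product), product_arg)
print(abs(quotient), quotient_arg)

# multiplication by i turns the vector through a right angle in the positive direction
turned = 1j * z1
turn = math.degrees(cmath.phase(turned)) - math.degrees(cmath.phase(z1))
print(abs(turned), turn)
```

Since the principal value of the argument is confined to a fixed interval, sums of arguments that leave this interval agree with the computed value only up to a multiple of 360 degrees; the sample values above were chosen to avoid this.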





The rule of multiplication of complex numbers extends automa- 
tically to an arbitrary number of factors. In particular, if we take 
equal factors then we have 

$$ [\rho\,(\cos\varphi + i\sin\varphi)]^n = \rho^n(\cos n\varphi + i\sin n\varphi) \qquad (n = 2, 3, \ldots) $$

In case $\rho = 1$ we obtain the formula

$$ (\cos\varphi + i\sin\varphi)^n = \cos n\varphi + i\sin n\varphi $$

which was named De Moivre’s formula after the English mathematician
A. de Moivre (1667-1754) who discovered the formula in 1707.

De Moivre’s formula can be applied to express the values of trigonometric
functions of multiple arcs. For example, taking n = 3
we get

$$ (\cos\varphi + i\sin\varphi)^3 = \cos^3\varphi + i\,3\cos^2\varphi\sin\varphi - 3\cos\varphi\sin^2\varphi - i\sin^3\varphi = \cos 3\varphi + i\sin 3\varphi $$

which implies, by formula (2), that

$$ \cos 3\varphi = \cos^3\varphi - 3\cos\varphi\sin^2\varphi, \qquad \sin 3\varphi = 3\cos^2\varphi\sin\varphi - \sin^3\varphi $$

Here, of course, we should take into account the table of powers
of the number i: $i^1 = i$, $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, $i^5 = i$, $i^6 = -1$
etc. By the way, note that $\frac{1}{i} = -i$.
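
De Moivre's formula and the resulting expressions for the triple angle can be tested numerically. The following lines are only a sketch: the function name `de_moivre_gap` and the sample angle are our own assumptions.

```python
import math

def de_moivre_gap(phi, n):
    """|(cos phi + i sin phi)^n - (cos n phi + i sin n phi)| for a positive integer n."""
    left = complex(math.cos(phi), math.sin(phi)) ** n
    right = complex(math.cos(n * phi), math.sin(n * phi))
    return abs(left - right)

phi = 0.4   # an arbitrary sample angle in radians
c, s = math.cos(phi), math.sin(phi)
cos3 = c ** 3 - 3 * c * s ** 2      # expression for cos 3 phi
sin3 = 3 * c ** 2 * s - s ** 3      # expression for sin 3 phi

print(de_moivre_gap(phi, 3), de_moivre_gap(phi, 7))
print(cos3 - math.cos(3 * phi), sin3 - math.sin(3 * phi))
print(1 / 1j == -1j)                # the remark 1/i = -i
```

All the printed differences should be zero up to rounding errors.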

Now we turn to extracting roots of complex numbers. If
$z = \rho\,(\cos\varphi + i\sin\varphi)$ is given and $\sqrt[n]{z} = w = r\,(\cos\psi + i\sin\psi)$
is to be found, then, by the definition of a root,
$z = w^n = r^n(\cos n\psi + i\sin n\psi)$. Comparing this with the original expression
of z we conclude (see the end of Sec. 1) that

$$ r^n = \rho, \qquad n\psi = \varphi + 2k\pi \quad (k \text{ is an arbitrary integer}) $$

Since r and $\rho$ are non-negative numbers, we have $r = (\sqrt[n]{\rho})_{\text{ord}}$
and $\psi = \frac{\varphi + 2k\pi}{n}$ where $(\sqrt[n]{\rho})_{\text{ord}}$ denotes the “ordinary” (i.e. the
arithmetic, real positive) root of a non-negative number. Thus,

$$ w = (\sqrt[n]{\rho})_{\text{ord}}\left(\cos\frac{\varphi + 2k\pi}{n} + i\sin\frac{\varphi + 2k\pi}{n}\right) $$

Making k assume the values 0, 1, 2, ... we shall get all the
values $w_1, w_2, w_3, \ldots$ of the root. But for k = n we obtain

$$ \begin{aligned} w_{n+1} &= (\sqrt[n]{\rho})_{\text{ord}}\left(\cos\frac{\varphi + 2n\pi}{n} + i\sin\frac{\varphi + 2n\pi}{n}\right) \\ &= (\sqrt[n]{\rho})_{\text{ord}}\left[\cos\left(\frac{\varphi}{n} + 2\pi\right) + i\sin\left(\frac{\varphi}{n} + 2\pi\right)\right] \\ &= (\sqrt[n]{\rho})_{\text{ord}}\left(\cos\frac{\varphi}{n} + i\sin\frac{\varphi}{n}\right) = w_1 \end{aligned} $$

Similarly, $w_{n+2} = w_2$ etc. For negative k we do not obtain any
new values either: k = -1 yields the same result as k = n - 1
etc. Finally,

$$ \left[\sqrt[n]{\rho\,(\cos\varphi + i\sin\varphi)}\right]_{k+1} = (\sqrt[n]{\rho})_{\text{ord}}\left(\cos\frac{\varphi + 2k\pi}{n} + i\sin\frac{\varphi + 2k\pi}{n}\right) \quad (k = 0, 1, \ldots, n - 1) \tag{4} $$

We see now that the nth root of a complex number has n different
values; the number z = 0 makes the only exception to this
rule since all the roots of 0 are equal to zero.

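For example, since $2i = 2\left(\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right)$, we have

$$(\sqrt{2i})_{1,2} = (\sqrt{2})_{\mathrm{ord}}\left(\cos\frac{\frac{\pi}{2} + 2k\pi}{2} + i\sin\frac{\frac{\pi}{2} + 2k\pi}{2}\right) \qquad (k = 0, 1)$$

which enables us to calculate $(\sqrt{2i})_1 = 1 + i$ and $(\sqrt{2i})_2 = -1 - i$
easily.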

Consider another example:

$$(\sqrt[3]{1})_{1,2,3} = (\sqrt[3]{1})_{\mathrm{ord}}\left(\cos\frac{2k\pi}{3} + i\sin\frac{2k\pi}{3}\right) \qquad (k = 0, 1, 2)$$

since $1 = 1(\cos 0 + i\sin 0)$. This implies $(\sqrt[3]{1})_1 = 1$,
$(\sqrt[3]{1})_2 = -\frac{1}{2} + i\frac{\sqrt{3}}{2}$ and
$(\sqrt[3]{1})_3 = -\frac{1}{2} - i\frac{\sqrt{3}}{2}$. In this case one of the
roots has turned out to be ordinary, real whereas the other two
roots are imaginary.

The geometrical meaning of formula (4) is illustrated in Fig. 185
where we have taken $n = 5$.


[Fig. 185]

The number $z^* = x - iy$ is called conjugate to the number
$z = x + iy$; we often write $\bar{z}$ instead of $z^*$. The simple
properties of conjugate numbers are the following:

(1) $(z^*)^* = (x - iy)^* = [x + i(-y)]^* = x - i(-y) = x + iy = z$, i.e. the
numbers $z$ and $z^*$ are mutually conjugate;

(2) $z + z^* = 2\,\mathrm{Re}\,z$, $z - z^* = 2i\,\mathrm{Im}\,z$;

(3) $z^* = z$ if and only if $z$ is real;

(4) $zz^* = (x + iy)(x - iy) = x^2 + y^2 = |z|^2$;

(5) $|z^*| = |z|$, $\mathrm{Arg}\,z^* = -\mathrm{Arg}\,z$, i.e. the points $z$ and $z^*$
are symmetric with respect to the real axis;

(6) $(z_1 + z_2)^* = z_1^* + z_2^*$ since

$$(z_1 + z_2)^* = (x_1 + iy_1 + x_2 + iy_2)^* = [x_1 + x_2 + i(y_1 + y_2)]^* = x_1 + x_2 - i(y_1 + y_2) = (x_1 - iy_1) + (x_2 - iy_2) = z_1^* + z_2^*$$

(7) $(z_1 z_2)^* = z_1^* z_2^*$ which can be verified in the same way as
property (6).

If we substitute $\frac{z_1}{z_2}$ for $z_1$ into the formula expressing
property (7) we get $z_1^* = \left(\frac{z_1}{z_2}\right)^* z_2^*$ which yields
$\left(\frac{z_1}{z_2}\right)^* = \frac{z_1^*}{z_2^*}$.

Properties (6) and (7) extend automatically to any arbitrary number
of summands or factors. For instance,

$$(z^n)^* = (z^*)^n,$$

$$(2z^n + iz^m)^* = (2z^n)^* + (iz^m)^* = 2(z^*)^n - i(z^*)^m \quad \text{etc.}$$

Generally, to pass from any rational expression containing an 
arbitrary number of variables and coefficients to the conjugate 
expression it is necessary to replace each variable and each coefficient 
by its conjugate value. It can be shown that this rule holds not only 
for rational expressions but also for irrational expressions, for sums 
of power series and so on. It follows that each equality involving 
complex expressions of the above type remains true when —i is 
substituted for i everywhere in the equality because the substitution 
transforms the original equality of complex numbers into the
equality of their conjugate numbers. The numbers $i$ and $-i$ are
therefore indistinguishable in the algebraic sense. It would be
incorrect to say that $i = \sqrt{-1}$ and $-i = -\sqrt{-1}$. In fact the
root $\sqrt{-1}$ simply has two different values designated as $\pm i$.

Conjugate numbers can be used, in particular, to separate the
real and the imaginary parts of a fraction of the form
$\frac{z_1}{z_2} = \frac{x_1 + iy_1}{x_2 + iy_2}$.
To do this we multiply the numerator and the denominator by $z_2^*$
and obtain the fraction with the real denominator which enables
us to perform the separation easily.
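
The separation described above can be sketched in Python as follows. The function name `divide` and the sample numbers are assumptions made for illustration; Python’s built-in complex division is used only as a cross-check.

```python
def divide(z1, z2):
    """Return (Re, Im) of z1/z2 by multiplying numerator and denominator
    by the conjugate of z2 (hypothetical helper name)."""
    x1, y1 = z1.real, z1.imag
    x2, y2 = z2.real, z2.imag
    denom = x2 * x2 + y2 * y2          # z2 * conj(z2) = |z2|^2, a real number
    re = (x1 * x2 + y1 * y2) / denom   # real part of z1 * conj(z2), scaled
    im = (y1 * x2 - x1 * y2) / denom   # imaginary part of z1 * conj(z2), scaled
    return re, im

# Illustrative values (not from the text): (1 + 2i)/(3 - 2i) = -1/13 + (8/13)i
re, im = divide(1 + 2j, 3 - 2j)
print(re, im)
print((1 + 2j) / (3 - 2j))  # built-in division for comparison
```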

For example,

$$\mathrm{Re}\,\frac{\ldots}{3 - i2} = \mathrm{Re}\,\frac{\ldots\,(3 + i2)}{(3 - i2)(3 + i2)} = \ldots = -\frac{4}{13}$$


4. Euler’s Formula. Now we turn to the transcendental operations
on complex numbers. In Sec. IV.16 we showed that for real $x$,

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots \qquad (5)$$

If we substitute $z$ for $x$ we shall come to the definition of the
exponential function with a complex exponent; it is defined as

$$e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \frac{z^3}{3!} + \ldots \qquad (6)$$

We shall show in Sec. XVII.14 that this definition can be justified
for all $z$ and besides the basic property of the exponential function
remains true:

$$e^{z_1} e^{z_2} = e^{z_1 + z_2} \qquad (7)$$

Formulas (6) and (5) show that in the special case when $z$ is real
the new definition of $e^z$ coincides with the old one; in general, each
new definition must not contradict the facts already established.
At the same time formula (7) confirms the expedience of this
definition of $e^z$.

In the same standard way we can define, for the complex values
of the argument, functions $f(x)$ which were originally defined for
real values of the argument. For this purpose we should expand
a given function $f(x)$ into Taylor’s series in powers of $x$ or in powers
of $x - a$ where $a$ is a real number and then replace $x$ by $z$ and denote
the sum of the series by $f(z)$. Thus, by analogy with (6), we write,
using formulas (IV.56) and (IV.57), for complex $z$,

$$\sin z = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \frac{z^7}{7!} + \ldots \qquad (8)$$

$$\cos z = 1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \frac{z^6}{6!} + \ldots \qquad (9)$$

etc. In Sec. XVII.14 we shall show that all the basic formulas which
have the form of identical equalities and hold for real values of
the argument (for instance, such as $\sin(-x) = -\sin x$ or
$\sin^2 x + \cos^2 x = 1$ etc.) remain true for the complex values of the
argument.

The above formulas reveal the essential relationship between the
exponential function and trigonometric functions. Namely,
substituting $iz$ for $z$ into (6) we deduce

$$e^{iz} = 1 + \frac{1}{1!}iz - \frac{1}{2!}z^2 - \frac{1}{3!}iz^3 + \frac{1}{4!}z^4 + \frac{1}{5!}iz^5 - \frac{1}{6!}z^6 - \frac{1}{7!}iz^7 + \ldots = \left(1 - \frac{1}{2!}z^2 + \frac{1}{4!}z^4 - \frac{1}{6!}z^6 + \ldots\right) + i\left(\frac{1}{1!}z - \frac{1}{3!}z^3 + \frac{1}{5!}z^5 - \frac{1}{7!}z^7 + \ldots\right)$$

This, together with (8) and (9), implies Euler’s formula

$$e^{iz} = \cos z + i\sin z \qquad (10)$$

which is very important. The formula

$$e^{-iz} = e^{i(-z)} = \cos(-z) + i\sin(-z) = \cos z - i\sin z$$

and the formulas

$$\cos z = \frac{e^{iz} + e^{-iz}}{2} \quad \text{and} \quad \sin z = \frac{e^{iz} - e^{-iz}}{2i} \qquad (11)$$

which are implied by the above formula and (10) are also often
used. All the formulas were discovered by Euler in 1743.
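
Euler’s formula can be illustrated by summing series (6) directly and comparing the result with $\cos z + i\sin z$. This is a rough numerical sketch, not a proof; the function name `exp_series`, the number of terms and the sample value of $z$ are assumptions.

```python
import cmath

def exp_series(z, terms=40):
    """Partial sum of series (6): 1 + z/1! + z^2/2! + ...
    (hypothetical helper; 40 terms is an arbitrary choice)."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)   # next term z^(n+1)/(n+1)!
    return total

z = 0.8 + 0.3j                # arbitrary complex test value
lhs = exp_series(1j * z)      # e^{iz} from the series
rhs = cmath.cos(z) + 1j * cmath.sin(z)
print(abs(lhs - rhs) < 1e-12)

# Formulas (11): cos z recovered from the two exponentials
cos_from_exp = (exp_series(1j * z) + exp_series(-1j * z)) / 2
print(abs(cos_from_exp - cmath.cos(z)) < 1e-12)
```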

From Euler’s formula (10), using property (7), we obtain the follow- 
ing expression for the exponential function with an arbitrary com- 
plex exponent: 

$$e^z = e^{x + iy} = e^x e^{iy} = e^x(\cos y + i\sin y) \qquad (12)$$

The comparison with trigonometric form (3) shows that

$$|e^z| = e^x, \qquad \mathrm{Arg}\,e^z = y + 2k\pi \qquad (13)$$

In particular, it is seen that we always have $|e^z| > 0$, i.e. $e^z \neq 0$.
If we write $z$ instead of $e^z$ in formula (12) then, by (13), we obtain

$$z = |z|(\cos\arg z + i\sin\arg z) = |z|e^{i\arg z} = \rho e^{i\varphi}$$


This “exponential form” of complex numbers is convenient for 
performing algebraic operations on them. 

Formulas (11) imply the following relationships between
trigonometric and hyperbolic functions (see Sec. I.28): $\cos z = \cosh iz$
and $\sin z = \frac{\sinh iz}{i}$, that is $\sinh iz = i\sin z$. From this,
substituting $iz$ for $z$, we also deduce $\cos iz = \cosh z$ and
$\sin iz = i\sinh z$.

It is these formulas that reveal the essential relationship between
the functions (the relationship was mentioned in Sec. I.28) which
enables us to transform relations between trigonometric functions
into the corresponding relations between hyperbolic functions and
vice versa. (Let the reader deduce the basic relation between $\cosh z$
and $\sinh z$ by substituting $iz$ for $z$ into the formula
$\cos^2 z + \sin^2 z = 1$.)

Using formulas (11) we can also obtain the expression of powers
of sine and of cosine in terms of trigonometric functions of multiple
arguments. For instance,

$$\cos^3 x = \left(\frac{e^{ix} + e^{-ix}}{2}\right)^3 = \frac{e^{i3x} + 3e^{ix} + 3e^{-ix} + e^{-i3x}}{8} = \frac{e^{i3x} + e^{-i3x}}{8} + \frac{3}{8}\left(e^{ix} + e^{-ix}\right) = \frac{\cos 3x}{4} + \frac{3\cos x}{4} \qquad (14)$$

etc. Transformations of this kind are used for integrating
trigonometric functions.

5. Logarithms of Complex Numbers. The definition of “complex
logarithms” is essentially the same as that for logarithms of real
numbers: the logarithm of a complex number $z$ is the number $w$
for which $z = e^w$. To find the value of the logarithm let us denote

$$z = \rho(\cos\varphi + i\sin\varphi), \qquad w = u + iv$$

Then we deduce from formula (12):

$$\rho(\cos\varphi + i\sin\varphi) = z = e^w = e^u(\cos v + i\sin v)$$

Since $u$ and $v$ are real this implies

$$e^u = \rho, \ \text{i.e.}\ u = \ln\rho; \qquad v = \varphi + 2k\pi \qquad (k \text{ is an integer})$$

where $\ln\rho$ is understood as an “ordinary”, real logarithm of a
positive number. Thus,

$$\mathrm{Ln}\,z = w = u + iv = \ln\rho + i\varphi + i2k\pi = \ln|z| + i\,\mathrm{Arg}\,z$$

where Ln z denotes the totality of all the values of the logarithm. 

Hence, a logarithm of a complex number has an infinite set of 
different values. The number “zero” is the only exception to the 
rule because it has no logarithm. We can write conditionally
(formally): $\mathrm{Ln}\,0 = -\infty + iv$ where $v$ is arbitrary.

Real positive numbers being a special case of complex numbers, 
their logarithms are also infinite-valued. One of the values is “ordi- 
nary”, real, whereas all the others are imaginary. For example, 
$$\mathrm{Ln}\,1 = \ln 1 + i0 + i2k\pi = i2k\pi \qquad (k = 0, \pm 1, \pm 2, \ldots)$$

For $k = 0$ we get the original value $\ln 1 = 0$ but at the same time
we can take for the logarithm of 1 the values $i2\pi$, $-i2\pi$, $i4\pi$ etc.
Let us check it up once again:

$$e^{i2k\pi} = \cos 2k\pi + i\sin 2k\pi = 1 + i0 = 1 \qquad (k = 0, \pm 1, \pm 2, \ldots) \qquad (15)$$

Negative real numbers also have logarithms but all their values are
imaginary. For example, $\mathrm{Ln}\,(-1) = i\pi(2k + 1)$ (check it up!).

By means of logarithms we raise a complex number to an arbitrary
complex power: the power is defined as

$$z_1^{z_2} = \left(e^{\mathrm{Ln}\,z_1}\right)^{z_2} = e^{z_2\,\mathrm{Ln}\,z_1}$$

and the right-hand side is calculated by formula (12). Since
logarithms are infinite-valued the whole power is also infinite-valued
in the general case.
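
The many-valued logarithm and the power defined through it can be sketched as follows. The function names `log_values` and `power_values`, and the range of $k$ that is sampled, are assumptions chosen for illustration.

```python
import cmath
import math

def log_values(z, ks=range(-2, 3)):
    """A few values of Ln z = ln|z| + i(arg z + 2*k*pi); z must be nonzero
    (hypothetical helper; only k = -2..2 is sampled by default)."""
    rho, phi = cmath.polar(z)
    return [complex(math.log(rho), phi + 2 * math.pi * k) for k in ks]

def power_values(z1, z2, ks=range(-2, 3)):
    """Values of z1**z2 = exp(z2 * Ln z1) for the same choices of k."""
    return [cmath.exp(z2 * w) for w in log_values(z1, ks)]

for w in log_values(-1):
    print(w, cmath.exp(w))    # every exp(w) is close to -1

print(power_values(1j, 1j))   # the values of i**i are all real
```

Each logarithm value reproduces the original number under exponentiation, and the power inherits one value per choice of $k$.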

§ 2. Complex Functions of a Real Argument

6. Definition and Properties. It is sometimes necessary to deal 
with functions which assume complex values although their inde- 
pendent variable, the argument, is real. As examples of such func- 
tions we can take 

(1) $z = (t + i)^3$; (2) $z = Me^{pt}$ ($p = \alpha + i\omega$) etc.

Here the independent variable is denoted by $t$ and the functions
are denoted by $z$. If we decompose the value of such a function into
its real and imaginary parts, i.e. $z = x + iy$, then each of the parts
will be a function of $t$; thus, in the above examples we obtain

(1) $x = t^3 - 3t$, $y = 3t^2 - 1$; (2) $x = Me^{\alpha t}\cos\omega t$,
$y = Me^{\alpha t}\sin\omega t$

(verify these relations taking into account that $M$ is real!).

In the general case, if 

$$z = f(t) = \varphi(t) + i\psi(t) \qquad (16)$$

we obtain

$$x = \varphi(t), \qquad y = \psi(t) \qquad (17)$$

Conversely, (17) yields (16). The indication of a complex function 
of a real argument is therefore equivalent to the indication of two 
ordinary, real, functions of the argument. This situation is quite 
analogous to the case when we have a vector function of a scalar 
argument (see Sec. VII.23). The analogy becomes still more complete 
if we interpret complex numbers as vectors (see Sec. 2). 

It follows that the theory of complex functions of a real argument 
does not involve any essentially new features in comparison with 
the theory of real functions. In particular, the definitions of the
continuity, of the derivative etc. are transferred without any changes.
From the considerations in Sec. XVII.14 it follows that all the
differentiation formulas remain true. For instance,

$$[(t + i)^3]' = 3(t + i)^2, \qquad (Me^{pt})' = Mpe^{pt} \quad \text{and so forth}$$

A function of form (16) is represented by a curve in a complex plane
which has parametric equations (17).

When using functions of form (16) the following obvious pro- 
perties should be taken into account: 

if complex functions are added up their real parts and their ima- 
ginary parts are added separately; 

if a complex function is multiplied by a real constant or by a real 
function the real and the imaginary parts are also multiplied by the 
same factor; 

if a complex function is differentiated the same operation is per- 
formed on its real and imaginary parts. 

The properties are expressed by the formulas 

$$\mathrm{Re}\,[f_1(t) + f_2(t)] = \mathrm{Re}\,f_1(t) + \mathrm{Re}\,f_2(t) \quad \text{etc.}$$

(Deduce the formulas!) 

These properties make it possible to perform the above operations
on the whole complex function and then to take the real or
the imaginary part of the resulting expression instead of performing
the operations on the real or on the imaginary part. It is remarkable
that such a transition to complex quantities together with the reverse
transition to the real quantities which are sought for sometimes
turns out to be more obvious and simpler than the corresponding
direct operations on the real quantities.

Here we also mention a function of the form 

$$\mathrm{Ln}\,[t - (a + ib)] = \ln|t - a - ib| + i\,\mathrm{Arg}\,[t - a - ib] = \frac{1}{2}\ln[(t - a)^2 + b^2] + i\left[\arctan\frac{b}{a - t} + k\pi\right]$$

The function is sometimes of use for certain applications when the 
integer k is chosen in an appropriate way. 

7. Applications to Describing Oscillations. Let us take the function 

$$U(t) = Me^{i(\omega t + \alpha)} = M\cos(\omega t + \alpha) + iM\sin(\omega t + \alpha) \qquad (M > 0,\ \omega > 0) \qquad (18)$$

This function is convenient to apply for investigating harmonic
oscillations (compare with Sec. I.29). For this purpose it should be
noted that expression (18) has the modulus $M$ and the argument
$\omega t + \alpha$, i.e. it is represented by a vector of constant length which
rotates uniformly with the angular speed $\omega$.

For example, let us consider the superposition of oscillations
with equal frequencies. Let it be necessary to sum up the two
quantities $u_1(t) = M_1\sin(\omega t + \alpha_1)$ and
$u_2(t) = M_2\sin(\omega t + \alpha_2)$. To do this we introduce the
corresponding complex quantities $U_1(t) = M_1e^{i(\omega t + \alpha_1)}$ and
$U_2(t) = M_2e^{i(\omega t + \alpha_2)}$, $u_1$ and $u_2$ being, respectively,
their imaginary parts. The vectors $U_1(t)$ and $U_2(t)$
rotate uniformly with the angular speed $\omega$ and therefore the vector
$U_1(t) + U_2(t)$ rotates uniformly with the same speed. Hence this
vector can be represented in form (18). To find $M$ and $\alpha$ it is
sufficient to consider the situation at the moment $t = 0$ (see Fig. 186).
The figure shows that projecting on the coordinate axes we obtain

$$\left.\begin{aligned} M\cos\alpha &= M_1\cos\alpha_1 + M_2\cos\alpha_2 \\ M\sin\alpha &= M_1\sin\alpha_1 + M_2\sin\alpha_2 \end{aligned}\right\} \qquad (19)$$

Taking the imaginary part of $U(t)$ we finally conclude that
$u_1(t) + u_2(t) = M\sin(\omega t + \alpha)$ where $M$ and $\alpha$ should be found
from equalities (19) (find them and explain the corresponding formula
for $M$ by means of Fig. 186).
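
Equalities (19) amount to adding the two complex amplitudes at $t = 0$. A minimal sketch, assuming illustrative amplitudes, phases and frequency that do not come from the text; the function name `superpose` is hypothetical.

```python
import cmath
import math

def superpose(m1, a1, m2, a2):
    """Amplitude M and phase alpha of m1*sin(wt+a1) + m2*sin(wt+a2),
    found from U1(0) + U2(0) as in equalities (19)."""
    total = cmath.rect(m1, a1) + cmath.rect(m2, a2)
    return abs(total), cmath.phase(total)

m1, a1, m2, a2 = 3.0, 0.4, 2.0, 1.5   # illustrative values (assumptions)
M, alpha = superpose(m1, a1, m2, a2)

# Check the result at an arbitrary moment of time
omega, t = 5.0, 0.37
direct = m1 * math.sin(omega * t + a1) + m2 * math.sin(omega * t + a2)
print(abs(direct - M * math.sin(omega * t + alpha)) < 1e-12)
```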

A similar result follows when we have the superposition of an
arbitrary number of harmonic oscillations with the same frequency;
the superposition of oscillations having different frequencies will
be discussed in Sec. XVII.23.

The advantage of exponential function (18) over the trigonometric
functions is particularly well seen when we differentiate:

$$\frac{dU}{dt} = Me^{i(\omega t + \alpha)}\,i\omega = i\omega U \qquad (20)$$

By Sec. 2, we again obtain a vector which rotates uniformly with
the angular speed $\omega$ but it leads $U$ by 90° and has a modulus which
is $\omega$ times that of $U$. The rotation and the stretching are repeated
when the differentiation of $U$ is continued.

Now we are going to demonstrate an application of functions of
form (18) to an electric circuit shown in Fig. 187. There is a
resistance $R$ and an inductance $L$ in the circuit. If an alternating
potential difference which varies according to the law
$\varphi = \varphi_0\sin(\omega t + \beta)$ is applied to the terminals of the circuit
there appears a steady-state alternating electric current flow also
varying harmonically: $j = j_0\sin(\omega t + \alpha)$ but $j_0$ and $\alpha$ are not
known beforehand. Equating $\varphi$ to the resulting voltage drop on $R$
and $L$ we deduce, using the well-known physical laws, the basic
equation of our problem: $Rj + L\frac{dj}{dt} = \varphi$.

Let us introduce the notion of a complex voltage and that of a
complex current by formulas $\Phi = \varphi_0e^{i(\omega t + \beta)}$ and
$J = j_0e^{i(\omega t + \alpha)}$. The “real” voltage and current are equal to the
imaginary parts of the corresponding expressions. By the properties
described in Sec. 6, to find $j$ we must solve the equation
$RJ + L\frac{dJ}{dt} = \Phi$ and then take the imaginary part of the
resulting expression. According to formula (20) we receive

$$RJ + Li\omega J = \Phi, \quad \text{i.e.} \quad J = \frac{\Phi}{R + i\omega L} \qquad (21)$$

We see that the inductance $L$ can be interpreted as a certain
resistance equal to $i\omega L$; this quantity is called the impedance of the
unit $L$. Writing the expression $(R + i\omega L)^{-1}$ in the exponential form
$(R + i\omega L)^{-1} = re^{-i\theta}$ we deduce from (21) (verify it!):

$$j_0 = r\varphi_0 = \varphi_0\left(R^2 + \omega^2L^2\right)^{-\frac{1}{2}}, \qquad \alpha = \beta - \arctan\frac{\omega L}{R} \qquad (22)$$

(Let the reader try to derive the expression for the vector
$j_0e^{i\alpha} = J\,|_{t=0}$ by using the first equality (21) and the geometric
method, and then deduce formula (22) from the expression.)
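
Formulas (21) and (22) can be compared numerically. In the sketch below the circuit parameters are made-up illustrative values and `rl_current` is a hypothetical helper name.

```python
import cmath
import math

def rl_current(phi0, beta, R, L, omega):
    """Amplitude j0 and phase alpha of the steady-state current,
    obtained from J = Phi / (R + i*omega*L) at t = 0, as in (21)."""
    J0 = cmath.rect(phi0, beta) / complex(R, omega * L)
    return abs(J0), cmath.phase(J0)

# Illustrative parameters (assumptions, not from the text)
phi0, beta, R, L, omega = 10.0, 0.3, 4.0, 0.5, 6.0
j0, alpha = rl_current(phi0, beta, R, L, omega)

# Compare with the closed-form expressions (22)
print(abs(j0 - phi0 / math.hypot(R, omega * L)) < 1e-12)
print(abs(alpha - (beta - math.atan(omega * L / R))) < 1e-12)
```

Dividing by the complex number $R + i\omega L$ performs the scaling and the phase shift in a single step, which is the convenience the text attributes to the complex form.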

§ 3. The Concept of a Function of a Complex Variable 

The theory and the applications of complex functions of a com- 
plex variable contain many new ideas and facts in comparison with 
the theory of functions of a real argument. The theory is dealt with
in many books. To a beginner we should recommend [15], [40], 
[44, Vol. 3, Part 1]. Here we shall give only some simple facts which 
are directly related to the subject matter of our course. 

8. Factorization of a Polynomial. Let us take an entire rational 
function of the argument z = x + iy, that is a polynomial 

$$P(z) = a_0z^n + a_1z^{n-1} + \ldots + a_{n-1}z + a_n \qquad (a_0 \neq 0) \qquad (23)$$

of the $n$th degree with certain coefficients $a_0, \ldots, a_n$ which can be
complex in the general case. In the books mentioned above the
reader can find the proof of a remarkable theorem, namely the
“fundamental theorem of algebra” which asserts that every polynomial of
degree $n \geq 1$ has at least one complex zero, i.e. there exists a root
of the equation

$$P(z) = 0 \qquad (24)$$

which can be either real or imaginary. (The theorem was proved by
D’Alembert in the middle of the 18th century. A more rigorous
proof was given by Gauss at the end of the 18th century.) If we denote
one of the roots by $z_1$ then, as it is proved in elementary courses
on algebra, $P(z)$ is divisible by the binomial $z - z_1$, that is
$P(z) = (z - z_1)P_1(z)$ where $P_1(z)$ is a polynomial of the $(n - 1)$th
degree. If we repeat the argument for $P_1(z)$ we shall get
$P_1(z) = (z - z_2)P_2(z)$, i.e. $P(z) = (z - z_1)(z - z_2)P_2(z)$ where $P_2(z)$
is a polynomial of degree $n - 2$. These considerations can be
continued up to the “polynomial of degree zero”, i.e. a constant, and
thus we receive

$$P(z) = a_0(z - z_1)(z - z_2)\ldots(z - z_n) \qquad (25)$$

This formula shows that all the numbers $z_1, z_2, \ldots, z_n$ are zeros
of the polynomial $P(z)$ and that it has no other zeros.

Thus, an algebraic equation of form (24) of the nth degree has 
exactly n roots. 

Some of the roots of equation (24) may coincide, i.e. they may be
repeated. Such roots are called multiple (double, i.e. of
multiplicity 2, or triple, i.e. of multiplicity 3 etc.) in contrast to simple
roots which are not repeated. When counting the number of roots
we should reckon each root according to its degree of multiplicity.

A simple test for distinguishing the multiplicity of a root is
implied by Taylor’s formula for a polynomial (IV.46) which, as we
shall show in Sec. 11, remains true for a polynomial in a complex
variable. Suppose $z = a$ is a root of equation (24). Then $P(a) = 0$
and the formula implies that if

$$P(a) = P'(a) = \ldots = P^{(k-1)}(a) = 0, \qquad P^{(k)}(a) \neq 0 \qquad (26)$$

then the polynomial $P(z)$ is divisible by $(z - a)^k$ and is indivisible
by $z - a$ to any power higher than $k$. The value $z = a$ is therefore
a root of equation (24) of multiplicity $k$.

If $P(z)$ is an arbitrary function, even a transcendental one, and
relations (26) hold for a value $z = a$ then $z = a$ is also called
a root (a zero) of equation (24) of multiplicity $k$. (In particular,
if $P(a) = 0$ and $P'(a) \neq 0$ the value $z = a$ is a simple
root.) In the case of an arbitrary $P(z)$ we conclude, by Taylor’s
series (IV.53), that for a $k$-tuple root $z = a$ the ratio $P(z)/(z - a)^k$
has a finite limit, as $z \to a$, which does not equal zero. Hence, the
ratio has a removable discontinuity at $z = a$ (see Sec. III.13). Thus
we can write

$$P(z) = \frac{P(z)}{(z - a)^k}(z - a)^k = (z - a)^kQ(z)$$

where the function $Q(z)$ remains continuous for $z = a$ and $Q(a) \neq 0$.

For example, the value $x = 0$ is a triple root of the equation

$$x - \sin x = 0$$

since $(x - \sin x)\,|_{x=0} = 0$, $(1 - \cos x)\,|_{x=0} = 0$, $(\sin x)\,|_{x=0} = 0$,
and $(\cos x)\,|_{x=0} = 1$.
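
For a polynomial, test (26) is easy to carry out mechanically. The following sketch assumes coefficients are listed from the highest power down; the helper names and the tolerance `tol` are illustrative choices, and the sample polynomial is not taken from the text.

```python
def derivative(coeffs):
    """Derivative of a polynomial given by coefficients, highest power first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def evaluate(coeffs, z):
    """Horner evaluation of the polynomial at z."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value

def multiplicity(coeffs, a, tol=1e-9):
    """Number k from test (26): the order of the first derivative that does
    not vanish at a; returns 0 if a is not a root (tol is an assumption)."""
    k = 0
    while coeffs and abs(evaluate(coeffs, a)) <= tol:
        coeffs = derivative(coeffs)
        k += 1
    return k

# Illustrative polynomial: (z - 1)^2 (z + 2) = z^3 - 3z + 2
print(multiplicity([1, 0, -3, 2], 1))    # 2: double root
print(multiplicity([1, 0, -3, 2], -2))   # 1: simple root
```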

If we combine equal factors in the factorization (25) we obtain 
$$P(z) = a_0(z - z_1)^{\alpha_1}\ldots(z - z_k)^{\alpha_k} \qquad (27)$$

where $z_1, \ldots, z_k$ are all the pairwise different roots of equation
(24) and $\alpha_1, \ldots, \alpha_k$ are their multiplicities. Factorizations (25)
and (27) hold for polynomials (23) with real and complex coefficients 
as well. 

If polynomial (23) has real coefficients then, together with every 
complex root, it has the conjugate root of the same multiplicity. 
Indeed, if $P(z_m) = 0$ then $[P(z_m)]^* = 0^* = 0$. But, by Sec. 3,

$$[P(z_m)]^* = [a_0z_m^n + \ldots + a_n]^* = a_0^*(z_m^*)^n + \ldots + a_n^* = a_0(z_m^*)^n + \ldots + a_n = P(z_m^*)$$

from which it follows that $z = z_m^*$ is also a solution of equation (24).
In case $z = z_m$ is a double root of equation (24) we have, in
addition, $P'(z_m) = 0$ which implies, in a similar way, that
$P'(z_m^*) = 0$ and so on.

Combining the factors which correspond to a pair of mutually
conjugate roots of the form $\alpha \pm i\beta$ in factorization (27) we obtain

$$[z - (\alpha + i\beta)][z - (\alpha - i\beta)] = (z - \alpha)^2 + \beta^2 = z^2 + pz + q \qquad (p = -2\alpha,\ q = \alpha^2 + \beta^2) \qquad (28)$$

Such combinations are used for factoring polynomials with real
coefficients depending on real argument. It is natural to denote the
independent variable as $x$, and thus we obtain from (27) the
expression

$$P(x) = a_0(x - x_1)^{\alpha_1}\ldots(x - x_r)^{\alpha_r}(x^2 + p_1x + q_1)^{\beta_1}\ldots(x^2 + p_sx + q_s)^{\beta_s} \qquad (29)$$

where the first $r$ parentheses correspond to the real roots and the
last $s$ parentheses correspond to $s$ pairs of conjugate imaginary roots.
Since $p$ and $q$ are real we conclude that every real polynomial (i.e.
a polynomial with real coefficients) can be factored into real linear
and quadratic factors. If all the roots are real there are only linear
factors in factorization (29) and if all the roots are imaginary then
there are only quadratic factors in it. The exponents $\alpha_1, \ldots, \alpha_r$,
$\beta_1, \ldots, \beta_s$ are equal to the multiplicities of the corresponding
roots; in particular, they equal unity for simple roots.
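
The grouping of conjugate roots into real quadratic factors by formula (28) can be sketched as follows. The function name `real_factors`, the tolerance and the sample polynomial $x^3 - 1$ are assumptions made for illustration.

```python
def real_factors(roots, tol=1e-9):
    """Split the roots of a real polynomial into real roots x_k (linear
    factors x - x_k) and pairs (p, q) of quadratic factors x^2 + p*x + q
    built from conjugate pairs as in (28). Hypothetical helper."""
    linear, quadratic = [], []
    for z in roots:
        if abs(z.imag) <= tol:
            linear.append(z.real)             # real root
        elif z.imag > 0:                      # one root from each conjugate pair
            alpha, beta = z.real, z.imag
            quadratic.append((-2 * alpha, alpha**2 + beta**2))
    return linear, quadratic

# x^3 - 1 has the roots 1 and -1/2 +- i*sqrt(3)/2,
# so it should factor as (x - 1)(x^2 + x + 1)
roots = [1 + 0j, complex(-0.5, 3**0.5 / 2), complex(-0.5, -(3**0.5) / 2)]
print(real_factors(roots))
```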

9. Numerical Methods of Solving Algebraic Equations. To realize
factorizations (27) and (29) it is necessary to solve equation (24),
that is an algebraic equation of the $n$th degree. The solution for
$n = 2$ is well known from elementary mathematical courses. In
textbooks on higher algebra there are formulas for the solutions for
$n = 3$ and $n = 4$ which were discovered as early as the 16th century.
But the formulas are so complicated that they are almost never
used for practical purposes, especially for $n = 4$. In case $n > 4$
there are no general formulas which express solutions in terms of
the coefficients of the equation by means of algebraic operations
on the coefficients. The non-existence of such formulas was proved
by Abel and by É. Galois (1811-1832), a French mathematician,
who created the fundamentals of modern algebra.

But algebraic equations can be solved approximately to within
any degree of accuracy! In § V.1 we described some methods of
calculating real roots of equations of form $f(x) = 0$. To find
imaginary roots we can use Newton’s method [see formula (V.7) in
which we may regard $x$ as an imaginary quantity] or an iterative
scheme (see Sec. V.3). It is also possible to substitute $z = x + iy$
into equation (24) and then separate the real and the imaginary
parts:

$$P(z) = P(x + iy) = Q(x, y) + iR(x, y)$$

This reduces equation (24) to a system of equations of the form

$$\left.\begin{aligned} Q(x, y) &= 0 \\ R(x, y) &= 0 \end{aligned}\right\}$$

which can be solved with the help of methods discussed in
Sec. XII.12. These methods are also described in [3], [6], [10],
[33] and [42].
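
Newton’s iteration with a complex starting value can be sketched as below. This is a generic form of the method, $z \leftarrow z - P(z)/P'(z)$, written for polynomials; it is not claimed to reproduce formula (V.7) literally, and the function name, step limit and starting guess are assumptions.

```python
def newton(coeffs, z0, steps=50, tol=1e-14):
    """Newton's iteration z <- z - P(z)/P'(z) for a polynomial given by
    coefficients (highest power first); z0 may be complex."""
    z = complex(z0)
    for _ in range(steps):
        p, dp = 0j, 0j
        for c in coeffs:       # Horner scheme for P and P' together
            dp = dp * z + p
            p = p * z + c
        if dp == 0:
            break              # derivative vanished; stop at this point
        step = p / dp
        z -= step
        if abs(step) < tol:
            break
    return z

# z^3 + z^2 - 3 = 0 with a complex initial guess (illustrative choice)
root = newton([1, 1, 0, -3], -1 + 1j)
print(root)
```

A real starting value keeps every iterate real, so a complex initial guess is needed to reach an imaginary root of a polynomial with real coefficients.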

There exist methods that are only applicable to algebraic equa- 
tions. We shall give here a method which was introduced by several 
authors and, among them, by N. I. Lobachevsky in 1834. 

Every algebraic equation can be written in the form 

$$z^n + a_1z^{n-1} + a_2z^{n-2} + \ldots + a_n = 0 \qquad (30)$$

with the coefficient in the highest power equal to unity (why is it
so?). Separating the even and the odd powers we get

$$z^n + a_2z^{n-2} + \ldots = -a_1z^{n-1} - a_3z^{n-3} - \ldots$$

If now we square both sides of the last equality we obtain an equation
containing only even powers of $z$. Therefore, denoting $z^2 = p$ we
receive an equation of the $n$th degree for $p$ (why is it so?) whose roots
are the squares of the roots of equation (30). If we then transform
the equation in like manner and put $p^2 = q$ we shall arrive at an
equation of the $n$th degree with the roots equal to the fourth powers
of the roots of equation (30) etc.
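
One squaring step can be sketched in Python. Moving everything to one side, squaring both sides of the separated equation is equivalent to forming the product $P(z)P(-z)$, which contains only even powers; the code uses that equivalent form. The function name `square_roots_step` and the sample polynomial are assumptions.

```python
def square_roots_step(coeffs):
    """One root-squaring step: coefficients (highest power first) of the
    polynomial whose roots are the squares of the roots of the input."""
    n = len(coeffs) - 1
    # P(-z): change the sign of the coefficients of odd powers
    flipped = [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]
    # The product P(z) * P(-z) contains only even powers of z
    prod = [0] * (2 * n + 1)
    for i, a in enumerate(coeffs):
        for j, b in enumerate(flipped):
            prod[i + j] += a * b
    sign = -1 if n % 2 else 1   # keeps the leading coefficient positive
    return [sign * c for c in prod[::2]]

# Illustrative check: (z - 1)(z - 2)(z - 3) = z^3 - 6z^2 + 11z - 6;
# the squared roots 1, 4, 9 give p^3 - 14p^2 + 49p - 36
print(square_roots_step([1, -6, 11, -6]))
```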

After several transformations of this type the roots with the
greatest moduli become the most important. For instance, if equation
(30) has the roots $z_1 = 2$, $z_2 = -1$ and $z_3 = \frac{1}{2}$ the next equation
has the roots $p_1 = 4$, $p_2 = 1$ and $p_3 = \frac{1}{4}$. The equation following
the above equations will have the roots $q_1 = 16$, $q_2 = 1$ and
$q_3 = \frac{1}{16}$ etc. After $m$ transformations are carried out we arrive at
an equation of the form

$$v^n + c_1v^{n-1} + \ldots + c_n = 0 \qquad (31)$$

and its roots, yet unknown, are equal to $v_1 = z_1^{2^m}, \ldots, v_n = z_n^{2^m}$.
Therefore, by (25), equation (31) has the form

$$\left(v - z_1^{2^m}\right)\ldots\left(v - z_n^{2^m}\right) = 0$$

Now let us substitute $v = 10^l\tilde{v}$ into equation (31) and let us
choose the integer $l$ in such a way that in the resulting equation
for $\tilde{v}$, after the division by the highest coefficient $10^{nl}$, the greatest
of the moduli of the coefficients in $\tilde{v}^{n-1}, \tilde{v}^{n-2}, \ldots, 1$ should be
of the order of 1, that is the greatest modulus is neither too large
nor too small. Then, since the equation thus obtained, after
factoring, must have the form

$$(\tilde{v} - \tilde{v}_1)\ldots(\tilde{v} - \tilde{v}_n) = 0 \qquad \left(\text{where } \tilde{v}_1 = 10^{-l}v_1 = 10^{-l}z_1^{2^m} \text{ etc.}\right) \qquad (32)$$

we conclude that the greatest root (or roots) of equation (32) is
(are) of the order of 1 whereas all the other roots are negligibly
small. Omitting these roots, i.e. equating them to zero, we thus
delete in the equation for $\tilde{v}$ the terms with too small coefficients.

Solving the equation for $\tilde{v}$ obtained after the deletion we find
approximate values of the roots having the greatest moduli. Then
turning back to $z$ we receive approximations to the roots of equation
(30) with the greatest moduli. The greater $m$, the greater the
accuracy of the approximations. There appears a difficulty in the
transition from $v$ to $z$ connected with the extraction of a root with the
index of radical $2^m$. If equation (30) has real coefficients and if we
get only one root with the greatest modulus in the equation for $\tilde{v}$
then equation (30) will also have only one root with the greatest
modulus and it will be real (why is it so?). Hence, in this case we
can limit ourselves to calculating only two real values of the root.
But if these conditions do not hold we have to extract the root
according to the rules of Sec. 2 and thus obtain many possible values
of the root. To determine which of the values should be taken we
may substitute them all (in succession) into equation (30) and thus
verify which of the roots satisfies the equation in the best way.

It is also possible to use the well-known relations between the
roots of an algebraic equation and its coefficients. To deduce these
relations it is necessary to compare the coefficients in equal powers
of $z$ in formulas (23) and (25) removing the parentheses in (25).
This yields

$$\left.\begin{aligned} z_1 + z_2 + \ldots + z_n &= -\frac{a_1}{a_0} \\ z_1z_2 + z_1z_3 + \ldots + z_{n-1}z_n &= \frac{a_2}{a_0} \quad \text{(all the possible products of two factors)} \\ &\ \,\vdots \\ z_1z_2\ldots z_n &= (-1)^n\frac{a_n}{a_0} \end{aligned}\right\} \qquad (33)$$

After the root $v_1$ of equation (31) has been found it is possible
to eliminate the binomial $v - v_1$ from the equation by dividing
the equation by the binomial. Then in like manner we can find the
next root etc. It is often possible to find the roots by applying
relations (33) to equation (31). Doing this we should take into account
that in case the roots of equation (31) differ greatly in the values of
their moduli it is permissible to neglect certain summands in the
left-hand sides of relations (33).

When the roots of equation (30) are determined approximately 
it is possible to apply some of the iterative methods described 
in § V.1, for example, Newton’s method (Sec. V.2), to make the
approximations more accurate. 

Now let us consider as an example the equation $z^3 + z^2 - 3 = 0$
already solved in Sec. V.2. The successive transformations according
to the Lobachevsky method yield

$$p^3 - p^2 + 6p - 9 = 0, \qquad q^3 + 11q^2 + 18q - 81 = 0, \qquad v^3 - 85v^2 + 2106v - 6561 = 0$$

(verify it!). Here we stop the process and make the substitution
$v = 10^l\tilde{v}$ which results in

$$\tilde{v}^3 - \frac{85}{10^l}\tilde{v}^2 + \frac{2106}{10^{2l}}\tilde{v} - \frac{6561}{10^{3l}} = 0 \qquad (34)$$

In this case we must take $l = 2$ (verify it by choosing another value
of $l$). Then the absolute term in equation (34) becomes small and
may be omitted. This implies

$$\tilde{v}^2 - 0.85\tilde{v} + 0.2106 = 0 \qquad (35)$$

Hence, $\tilde{v}_1$ and $\tilde{v}_2$, and therefore $z_1$ and $z_2$, are pairwise conjugate
imaginary numbers. Further, since $\tilde{v}_1\tilde{v}_2 = 0.2106$ we have
$v_1v_2 = 2106$ and, by the equality $(z_1z_2)^8 = v_1v_2$, we deduce

$$z_1z_2 = |z_{1,2}|^2 = \sqrt[8]{2106} = 2.60$$

But $z_1z_2z_3 = 3$ (why is it so?), i.e. $z_3 = \frac{3}{2.60} = 1.15$.
Besides, $z_1 + z_2 + z_3 = -1$ (why is it so?). Therefore if we put
$z_{1,2} = \alpha \pm i\beta$ then $2\alpha + 1.15 = -1$ which implies $\alpha = -1.08$.
But $\alpha^2 + \beta^2 = |z_{1,2}|^2 = 2.60$, that is $\beta = \sqrt{2.60 - 1.08^2} = 1.20$.
Thus, approximately, $z_{1,2} = -1.08 \pm i1.20$ and $z_3 = 1.15$.

Iterations by Newton’s formula (V.6) yield more accurate values
$z_{1,2} = -1.087 \pm i1.172$ and $z_3 = 1.175$.

The Lobachevsky method is especially convenient when the roots
of the equation are real and different in their moduli. It is
sometimes possible to simplify the calculations. For instance, in solving
the above problem it was possible to limit the calculations to finding
only $z_3$ because after dividing by $z - z_3$ we can reduce the problem
to a quadratic equation.

The Lobachevsky method and some similar methods are treated
in detail in [22], [28], [38] and [50].
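
The arithmetic of the worked example can be retraced in a few lines of Python. The sketch starts from the coefficients of the equation for $v$ quoted in the text and follows the same steps; variable names are illustrative, and the small differences from the text in the last digit come from rounding at intermediate stages.

```python
import math

# Coefficients of v^3 + c1*v^2 + c2*v + c3 = 0 after m = 3 squarings
c1, c2, c3 = -85.0, 2106.0, -6561.0
m = 3

# Dropping the small absolute term leaves a quadratic whose two roots
# have the product c2, so |z1|^2 = |z2|^2 = c2 ** (1 / 2**m)
mod_sq = c2 ** (1.0 / 2**m)
z3 = 3.0 / mod_sq               # from z1 * z2 * z3 = 3
alpha = (-1.0 - z3) / 2.0       # from z1 + z2 + z3 = -1
beta = math.sqrt(mod_sq - alpha**2)

print(mod_sq, z3, alpha, beta)  # roughly 2.60, 1.15, -1.08, 1.20
```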

10. Decomposition of a Rational Fraction into Partial Rational
Fractions. Remember that a rational fraction (see Sec. I.17) is
a ratio of two polynomials:

$$\frac{Q(z)}{P(z)} = \frac{b_0z^m + \ldots + b_m}{a_0z^n + \ldots + a_n} \qquad (36)$$

If $m < n$ the fraction is called proper (no matter what the values
of the coefficients are) and it is called improper if otherwise. An
improper fraction can always be represented as a sum of an entire
rational function (i.e. a polynomial) and a proper fraction. For
instance, we can achieve this by dividing the numerator by the
denominator, according to the usual rule of division of polynomials.
For example,

$$\frac{z^3}{z^2 - 3} = z + \frac{3z}{z^2 - 3}, \qquad \frac{z^2 - 1}{2z^2 + 5} = \frac{1}{2} - \frac{\frac{7}{2}}{2z^2 + 5}$$

and so forth.
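
The division of polynomials mentioned above can be sketched with exact rational arithmetic from Python’s standard `fractions` module. The function name `poly_divmod` and the coefficient-list convention (highest power first) are assumptions; the two sample fractions are illustrative.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given by coefficient lists (highest power first).
    Returns (quotient, remainder) with the remainder of lower degree than
    den; den[0] is assumed to be nonzero."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        for i, d in enumerate(den):   # subtract factor * den, aligned at the top
            num[i] -= factor * d
        num.pop(0)                    # the leading term is now zero
    return quotient, num

# z^3 / (z^2 - 3): quotient z, remainder 3z
print(poly_divmod([1, 0, 0, 0], [1, 0, -3]))
# (z^2 - 1) / (2z^2 + 5): quotient 1/2, remainder -7/2
print(poly_divmod([1, 0, -1], [2, 0, 5]))
```

The quotient is the entire part and the remainder over the denominator is the proper fraction.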

It is important to remark that a sum of proper rational fractions
is also a proper rational fraction whereas this rule does not hold
for fractional numbers. To prove this let us mark off the degree of
a polynomial by the subscript. Then we have

$$\frac{Q_m(z)}{P_n(z)} + \frac{\tilde{Q}_{\tilde{m}}(z)}{\tilde{P}_{\tilde{n}}(z)} = \frac{Q_m(z)\tilde{P}_{\tilde{n}}(z) + \tilde{Q}_{\tilde{m}}(z)P_n(z)}{P_n(z)\tilde{P}_{\tilde{n}}(z)}$$

If the fractions on the left-hand side are proper then we have $m < n$
and $\tilde{m} < \tilde{n}$. But in this case the first summand in the numerator
of the expression on the right-hand side is of degree $m + \tilde{n} < n + \tilde{n}$
and the second summand is of degree $\tilde{m} + n < \tilde{n} + n$. The whole
numerator is therefore of degree $< n + \tilde{n}$ whereas the degree of
the denominator equals $n + \tilde{n}$. Hence, the sum is a proper fraction.
Obviously, the same is true for any number of summands.

Let (36) be a proper fraction. Factor the denominator in linear
factors and combine similar factors [see formula (27)]. Then we can
prove that the fraction can be represented in the form

$$\begin{aligned} \frac{Q(z)}{P(z)} ={}& \frac{A_1}{(z - z_1)^{\alpha_1}} + \frac{A_2}{(z - z_1)^{\alpha_1 - 1}} + \ldots + \frac{A_{\alpha_1}}{z - z_1} \\ &+ \frac{B_1}{(z - z_2)^{\alpha_2}} + \frac{B_2}{(z - z_2)^{\alpha_2 - 1}} + \ldots + \frac{B_{\alpha_2}}{z - z_2} + \ldots \\ &+ \frac{D_1}{(z - z_k)^{\alpha_k}} + \frac{D_2}{(z - z_k)^{\alpha_k - 1}} + \ldots + \frac{D_{\alpha_k}}{z - z_k} \end{aligned} \qquad (37)$$

where $A_1, A_2, \ldots, D_{\alpha_k}$ are certain numerical coefficients. The
rational fractions appearing on the right-hand side are called partial
rational fractions of the first type. Thus, every proper rational
fraction may be represented as a sum of partial rational fractions of
the first type.

To prove the assertion we transform the fraction as follows:

$$\begin{aligned}
\frac{Q(z)}{P(z)} &= \frac{Q(z)}{a_0 (z - z_1)^{\alpha_1} (z - z_2)^{\alpha_2} \dots (z - z_k)^{\alpha_k}} \\
&= \frac{Q(z)\,[(z - z_1) - (z - z_2)]}{a_0 (z - z_1)^{\alpha_1} (z - z_2)^{\alpha_2} \dots (z - z_k)^{\alpha_k} (z_2 - z_1)} \\
&= \frac{Q(z)}{a_0 (z_2 - z_1) (z - z_1)^{\alpha_1 - 1} (z - z_2)^{\alpha_2} \dots (z - z_k)^{\alpha_k}}
 - \frac{Q(z)}{a_0 (z_2 - z_1) (z - z_1)^{\alpha_1} (z - z_2)^{\alpha_2 - 1} \dots (z - z_k)^{\alpha_k}}
\end{aligned}$$

The constant factor $z_2 - z_1$ may be combined with the coefficient
$a_0$ and thus the number of linear factors in the denominators of the
two resulting fractions is one less than that of the original fraction.
Repeating transformations of this kind for each fraction we
again reduce the number of linear factors in the denominators by
one and so forth. We can proceed in this way as long as there are
at least two different factors in the denominators. After a certain
number of transformations there will no longer be different linear
factors in the denominators, that is we shall arrive at a sum of fractions
of the form $\dfrac{Q(z)}{a (z - z_j)^{\alpha}}$.

If we expand the numerator into powers of $z - z_j$ (see the beginning
of Sec. IV.15) we obtain

$$\frac{Q(z)}{a (z - z_j)^{\alpha}} = \frac{c + d (z - z_j) + \dots}{a (z - z_j)^{\alpha}} = \frac{c/a}{(z - z_j)^{\alpha}} + \frac{d/a}{(z - z_j)^{\alpha - 1}} + \dots$$

Here after the division is performed there may be an entire part
(that is an entire rational function, a polynomial). If now we add
together all the fractions thus obtained we shall obtain the final formula
(37), and all the entire parts must mutually cancel because
otherwise a proper fraction would appear in the form of a sum of
a proper fraction and a polynomial, which is impossible. (Why is
it impossible?)

Practically, the decomposition is usually carried out by means
of the method of undetermined coefficients. For this purpose we
write the right-hand side of formula (37) with literal coefficients
in the numerators and then find the coefficients. There are different
techniques for calculating the coefficients. We shall demonstrate
them by considering an example.

Let it be required to decompose the fraction

$$\frac{x^3 - 2x + 3}{x (x - 1)(x + 2)^2} \tag{38}$$

into partial fractions. Since the fraction is proper we write, by
formula (37),

$$\frac{x^3 - 2x + 3}{x (x - 1)(x + 2)^2} = \frac{A}{x} + \frac{B}{x - 1} + \frac{C}{(x + 2)^2} + \frac{D}{x + 2} \tag{39}$$


where the coefficients, for simplicity's sake, are all denoted by
different letters. Multiplying by the common denominator we obtain

$$x^3 - 2x + 3 = A (x - 1)(x + 2)^2 + B x (x + 2)^2 + C x (x - 1) + D x (x - 1)(x + 2) \tag{40}$$

The equality must be an identity. We can therefore remove the
parentheses and equate the coefficients of the same powers of x.
This yields the system of equations (verify it!) of the form

$$\left.\begin{array}{l|rcr}
x^3 & A + B + D &=& 1 \\
x^2 & 3A + 4B + C + D &=& 0 \\
x & 4B - C - 2D &=& -2 \\
1 & -4A &=& 3
\end{array}\right\} \tag{41}$$

from which it is easy to find

$$A = -\tfrac{3}{4} = -0.750; \quad B = \tfrac{2}{9} = 0.222; \quad C = -\tfrac{1}{6} = -0.167; \quad D = \tfrac{55}{36} = 1.528 \tag{42}$$

This method is the most reliable [equating the coefficients we
are sure that relation (40) is an identity and therefore (39) is also
an identity] but not the simplest. There is another method which
we can illustrate by taking (39) as an example. Without removing
the parentheses in equality (40) we simply make x assume four
different values according to the number of unknown coefficients
to obtain equations for determining A, B, C and D. In our example
it is very convenient to put x = 0, 1 and -2 since these values
eliminate some summands and, in addition, to equate x to any
arbitrary value, for example, to -1. This results in the relations

$$-4A = 3; \quad 9B = 2; \quad 6C = -1; \quad -2A - B + 2C + 2D = 4 \tag{43}$$

which imply the same values (42).

The last method is the collocation method, that is the method of
equating two expressions for different values of the argument. Such
a method is especially effective in case the denominator of the fraction
that should be decomposed has no multiple roots, that is all the
linear factors entering in its factorization have the first power.
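The collocation computation for the fraction $(x^3 - 2x + 3) / [x (x - 1)(x + 2)^2]$ can be reproduced with exact rational arithmetic. The following is a minimal sketch, not a general-purpose routine: the helper names `numerator`, `partial_sum` and `original` are introduced here only for illustration, and the substitution points x = 0, 1, -2, -1 are the ones used in the text.

```python
from fractions import Fraction

def numerator(x):
    # Left-hand side of identity (40): x^3 - 2x + 3
    return x**3 - 2 * x + 3

# Each of x = 0, 1, -2 removes all but one summand on the right of (40).
A = Fraction(numerator(0), (0 - 1) * (0 + 2) ** 2)   # from -4A = 3
B = Fraction(numerator(1), 1 * (1 + 2) ** 2)          # from 9B = 2
C = Fraction(numerator(-2), (-2) * (-2 - 1))          # from 6C = -1
# x = -1 gives -2A - B + 2C + 2D = 4, which is solved for D.
D = (numerator(-1) - (-2 * A - B + 2 * C)) / 2

def partial_sum(x):
    # Right-hand side of (39)
    return A / x + B / (x - 1) + C / (x + 2) ** 2 + D / (x + 2)

def original(x):
    # Fraction (38)
    return Fraction(numerator(x)) / (x * (x - 1) * (x + 2) ** 2)

# Spot check of identity (39) at points not used for collocation.
for x in (Fraction(2), Fraction(-5), Fraction(1, 3)):
    assert partial_sum(x) == original(x)

print(A, B, C, D)  # -3/4 2/9 -1/6 55/36
```

Working with `Fraction` instead of floating-point numbers keeps the check exact, so the equality test at the extra points is meaningful.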

If fraction (36) has real coefficients and if the independent variable
is considered real the denominator can nevertheless have imaginary
roots. Then though decomposition (37) is possible it may
sometimes be inconvenient. In such cases another decomposition
is often used. Namely, starting from decomposition (37) and using
formula (29) we can prove the validity of the following decomposition:

$$\begin{aligned}
\frac{Q(x)}{P(x)} ={}& \frac{Q(x)}{a_0 (x - x_1)^{\alpha_1} \dots (x - x_r)^{\alpha_r} (x^2 + p_1 x + q_1)^{\beta_1} \dots (x^2 + p_s x + q_s)^{\beta_s}} \\
={}& \frac{A_1}{(x - x_1)^{\alpha_1}} + \frac{A_2}{(x - x_1)^{\alpha_1 - 1}} + \dots + \frac{A_{\alpha_1}}{x - x_1} + \dots \\
&+ \frac{D_1}{(x - x_r)^{\alpha_r}} + \frac{D_2}{(x - x_r)^{\alpha_r - 1}} + \dots + \frac{D_{\alpha_r}}{x - x_r} \\
&+ \frac{M_1 x + N_1}{(x^2 + p_1 x + q_1)^{\beta_1}} + \frac{M_2 x + N_2}{(x^2 + p_1 x + q_1)^{\beta_1 - 1}} + \dots + \frac{M_{\beta_1} x + N_{\beta_1}}{x^2 + p_1 x + q_1} + \dots \\
&+ \frac{U_1 x + V_1}{(x^2 + p_s x + q_s)^{\beta_s}} + \frac{U_2 x + V_2}{(x^2 + p_s x + q_s)^{\beta_s - 1}} + \dots + \frac{U_{\beta_s} x + V_{\beta_s}}{x^2 + p_s x + q_s}
\end{aligned} \tag{44}$$

The fractions on the right-hand side which have denominators
equal to powers of quadratic trinomials are called the partial rational
fractions of the second type. As before, all the coefficients entering
into the numerators can be found by the method of undetermined
coefficients. But since all the operations are performed now only
on real numbers the unknown coefficients which should be found
from a system of equations of the first degree [obtained by analogy
with system (41) or (43)] must necessarily be real.

Thus, every proper rational fraction with real coefficients can be 
represented in the form of a sum of partial rational fractions of the 
first and of the second types with real coefficients. It may happen 
that there will be only fractions of the first type in case all the roots 
of the denominator are real or only fractions of the second type if 
all the roots are imaginary. 

We shall not give here the proof of the possibility of decomposi- 
tion (44). In every concrete problem the validity of (44) is confirmed 
by the results of the calculations. 
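As the text says, in a concrete problem the decomposition with fractions of the second type is confirmed by the calculation itself. A small sketch of such a check follows. The fraction $1 / [(x - 1)(x^2 + 1)]$ is a hypothetical example chosen here for illustration (it does not come from the text); its denominator has one real root and a pair of imaginary roots, so we look for $A/(x - 1) + (Mx + N)/(x^2 + 1)$.

```python
from fractions import Fraction

# Clearing denominators gives the identity
#     1 = A (x^2 + 1) + (M x + N)(x - 1)
# and the coefficients are found by collocation at x = 1, 0, -1.
A = Fraction(1, 1**2 + 1)        # x = 1 removes the second summand: 1 = 2A
N = A - 1                        # x = 0 gives 1 = A - N
M = (1 - 2 * A + 2 * N) / 2      # x = -1 gives 1 = 2A + 2M - 2N

def decomposition(x):
    return A / (x - 1) + (M * x + N) / (x * x + 1)

def original(x):
    return 1 / ((x - 1) * (x * x + 1))

# The validity of the decomposition is confirmed at several other points.
for x in (Fraction(2), Fraction(-3), Fraction(5, 7)):
    assert decomposition(x) == original(x)

print(A, M, N)  # 1/2 -1/2 -1/2
```

All three coefficients come out real, in agreement with the remark that only operations on real numbers are involved.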

11. Some General Remarks on Functions of a Complex Variable.
Each of the functions

$$w = z^3 - iz, \quad w = \frac{1}{z^2 + 1}, \quad w = z e^z \quad \text{etc.} \tag{45}$$

assumes complex values for complex values of z. In the general
form a relationship of this type can be written as w = f(z) where
both z and w are complex. Many notions of the theory of real variables
can be transferred to the theory of complex functions of complex
variables without any essential changes. This refers to the
notion of a limit and to the properties of limits (of course, with the
necessary exception of those properties that are connected with inequalities),
to the notion of continuity and to that of points of
discontinuity and so on. The determination of the points of discontinuity
of a function is carried out in the same way as for the
functions of real variables in Sec. III.13. For instance, the first and
the third of functions (45) are continuous for all z while the second
one has two points of discontinuity $z = \pm i$ at which it approaches
infinity.

The definition of the derivative of a function of a complex variable
is analogous to the definition of the derivative of a real function
(see Sec. IV.2):

$$\frac{dw}{dz} = f'(z) = \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z} = \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z}$$

where the limit must be uniquely determined and independent of
the process of approaching zero as $\Delta z \to 0$ which can be quite arbitrary.
One can easily verify that all the properties of the derivative
and all the differentiation formulas established in Secs. IV.4-5 remain
true without changes but we are not going to treat these questions
in detail here. The notion of derivatives of higher orders and the
Taylor formula and series (see Sec. IV.5) that are based on this
notion also remain true. We shall again discuss the question in
Sec. XVII.14.

Fig. 188. The dotted lines represent the mapping: $w_1 = f(z_1)$ etc.
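The requirement that the limit be independent of the way in which the increment of z tends to zero can be illustrated numerically. The sketch below is only an experiment, not a proof: it takes the function $w = z^3 - iz$, forms the difference quotient for small increments pointing in several directions of the complex plane, and compares each with the value $3z^2 - i$ given by the usual power rule. The test point, the step size and the list of directions are arbitrary choices made here.

```python
import cmath

def f(z):
    # The function w = z^3 - i z
    return z**3 - 1j * z

def f_prime(z):
    # Derivative obtained by the usual differentiation formulas
    return 3 * z**2 - 1j

def difference_quotient(z, dz):
    return (f(z + dz) - f(z)) / dz

z0 = 1 + 2j                                   # arbitrary test point
h = 1e-6                                      # modulus of the increment
angles = [0.0, 0.7, cmath.pi / 2, 2.5, 4.0]   # directions of approach

deviations = [
    abs(difference_quotient(z0, h * cmath.exp(1j * a)) - f_prime(z0))
    for a in angles
]
print(max(deviations))  # small for every direction, of the order of h
```

For a function without a complex derivative (for instance the complex conjugate of z) the same experiment would give quotients that differ noticeably from one direction to another.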

The geometric interpretation of a function of a complex variable
w = f(z) involves some essentially new ideas. Since the values
of z are represented by points in a complex plane of the argument z
and the corresponding values of w are represented by points in
the complex plane of the variable w we see that in this case there is
a certain correspondence between the points of the z-plane and the
points of the w-plane. We can also say that the points of the z-plane
are mapped on the points of the w-plane. Or, in other words, the
z-plane or its part where the function w = f(z) is defined is mapped
into the w-plane. For example, the function $w = z^3 - iz$ performs
a mapping under which the points z = 0, z = 1, z = i and
z = 2 - i are mapped on the points w = 0, w = 1 - i, w = 1 - i
and w = 1 - 13i, respectively, and so on. (Check it!) If the
point z moves in the z-plane and traces a curve then w also describes
a certain curve in the w-plane (see Fig. 188). Thus, curves are transformed
into curves under the mapping w = f(z) and geometric figures
in the z-plane are mapped on the geometric figures in the w-plane
although the form of a geometric figure may be considerably changed
by the mapping.
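The check suggested in the text is easy to carry out with Python's built-in complex numbers. The sketch below assumes the mapping $w = z^3 - iz$ and the four points z = 0, 1, i, 2 - i named above; the function name `w` and the list names are chosen here for illustration.

```python
def w(z):
    # The mapping w = z^3 - i z, written with explicit multiplications
    return z * z * z - 1j * z

points = [0, 1, 1j, 2 - 1j]
images = [w(z) for z in points]

for z, image in zip(points, images):
    print(z, "->", image)
```

Note that two different points, z = 1 and z = i, are mapped on the same point w = 1 - i, so the mapping is not one-to-one.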



CHAPTER IX 


Functions 

of Several Variables 


§ 1. Functions of Two Variables 

1. Methods of Representing. The concept of a function of any number
of independent variables and the corresponding notation were
introduced in Sec. I.11 and Sec. I.12. We have already used the
concept but it is necessary to discuss the ways of representing such
functions in greater detail.

The method of analytical representation of a function z = f(x, y)
depending on two variables does not essentially differ from the one
applied to functions of one variable whereas the tabular method
becomes much more complicated in this case because now we have
to represent the values of two independent variables and it is therefore
necessary to use a table with two entries. Such a table may have
the following form:

TWO-ENTRY TABLE
z = f(x, y)

 x \ y | y_1                | y_2                | y_3                | ... | y_N
 ------+--------------------+--------------------+--------------------+-----+--------------------
 x_1   | z_11 = f(x_1, y_1) | z_12 = f(x_1, y_2) | z_13 = f(x_1, y_3) | ... | z_1N = f(x_1, y_N)
 x_2   | z_21 = f(x_2, y_1) | z_22 = f(x_2, y_2) | z_23 = f(x_2, y_3) | ... | z_2N = f(x_2, y_N)
 ...   | ...                | ...                | ...                | ... | ...
 x_M   | z_M1 = f(x_M, y_1) | z_M2 = f(x_M, y_2) | z_M3 = f(x_M, y_3) | ... | z_MN = f(x_M, y_N)
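A two-entry table of this kind is straightforward to generate by program. The following sketch is an illustration only: the function $f(x, y) = x^2 + y$ and the grids of values are hypothetical and were chosen merely to have something to tabulate; `two_entry_table` and `format_table` are names introduced here.

```python
def two_entry_table(f, xs, ys):
    """Return rows z[i][j] = f(xs[i], ys[j]): one row per x, one column per y."""
    return [[f(x, y) for y in ys] for x in xs]

def format_table(xs, ys, table):
    """Lay the values out with the y values as a header and the x values at the left."""
    header = "x \\ y".ljust(8) + "".join(f"{y:>8}" for y in ys)
    lines = [header]
    for x, row in zip(xs, table):
        lines.append(f"{x:<8}" + "".join(f"{z:>8}" for z in row))
    return "\n".join(lines)

def f(x, y):
    # Hypothetical function used only as an example
    return x * x + y

xs = [0, 1, 2]
ys = [10, 20, 30, 40]
table = two_entry_table(f, xs, ys)
print(format_table(xs, ys, table))
```

The number of stored values is the product of the numbers of x and y values, which is why such tables quickly become unwieldy for fine grids.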


Here we have to denote the values of the function by two indices.
Obviously, it is difficult to compile such a table if we have a great
number of values for x and y.

“γράφειν” to write) which deals with methods of constructing nomograms,
that is special drawings that are helpful for representing
functions of any number of independent variables in a practically
convenient way. The application of nomograms saves much time
and effort in calculations and does not require any special qualification.
It should therefore be widely recommended. There are
many different types of nomograms. As an example we represent
a nomogram in Fig. 193 which is designed for calculating
one of the setting angles $\alpha_y$ of a cutter in a cutter grinder for
given tool angles $\alpha$ and $\varphi$. The values of $\alpha$, $\varphi$ and $\alpha_y$ are marked
on the three corresponding axes (two of them are curvilinear). If
we apply a ruler to certain points $\alpha$ and $\varphi$ on the corresponding
axes we can read the desired value of $\alpha_y$ on the third axis.
For instance, in Fig. 193 we have the values $\alpha = 10°$ and $\varphi = 30°$
which yield $\alpha_y = 19.5°$.

Fig. 192

Fig. 193


2. Domain of Definition. The domain of definition of a function
z = f(x, y) is the range of the independent variables x and y. If
the independent variables are continuous the domain is either the
whole x, y-plane or a region in the plane, or, finally, a totality
of a number of regions in the plane. When we speak about a region
in the x, y-plane we usually mean a connected (simply-connected)
set of points, that is a set consisting of one entire part of the plane
which does not degenerate (this means that it is not a line or a point
or a number of separate points). We sometimes distinguish between
closed regions (domains) which include their boundaries and open
ones which do not include the boundaries. In other words, such
regions in a plane play the same role as intervals on a straight line
(see Sec. I.5).

For example, the domain of definition of the function $z = x + y$
is the whole x, y-plane. The domain of the function $z = \sqrt{y - x}$
(in case we consider only real values of z) is defined by the inequality
$y - x \ge 0$, i.e. $y \ge x$. The domain of the function
$z = \dfrac{1}{\sqrt{1 - x^2 - y^2}}$ is obtained from the inequality
$x^2 + y^2 < 1$ etc. These domains are shown in Fig. 194.

Fig. 194

3. Linear Function. According to Sec. I.17, a linear function
of two variables has the form

$$z = ax + by + c \tag{1}$$

where a, b and c are constant coefficients. By analogy with Sec. I.22
we can easily derive a formula for an increment of the function:

$$\Delta z = a\,\Delta x + b\,\Delta y$$

Similar formulas hold for linear functions of any number of variables.

Formula (1) having three coefficients, any linear approximation
(i.e. an approximate replacement of a function by a linear function)
requires three conditions. For instance, let the values of a function
f(x, y) be known:

$$f(x_1, y_1) = z_1, \quad f(x_2, y_2) = z_2, \quad f(x_3, y_3) = z_3$$

If we want to construct a linear function (1) which takes on the
same values (i.e. to carry out the linear interpolation) by substituting
(approximately) an expression of type (1) for f we must have

$$\left.\begin{aligned}
a x_1 + b y_1 + c &= z_1 \\
a x_2 + b y_2 + c &= z_2 \\
a x_3 + b y_3 + c &= z_3
\end{aligned}\right\} \tag{2}$$

We determine the coefficients by solving this system. Such a replacement
of f by (1) yields good results if, in the first place, we
consider the values of the arguments lying inside the triangle with
the vertices $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$ (see Fig. 195) and, in the
second place, if the triangle is not large
enough for non-linear properties of the
function f to be manifested in a noticeable
manner. Besides, the triangle must
not have very acute angles. (If in a certain
limiting process one of the angles
vanishes and the triangle turns into
a line segment the determinant of system
(2) also vanishes and our calculations
become inapplicable.)

If we replace f by (1) outside the triangle
(this is the linear extrapolation) then,
in general, the error will increase as we
move away from the triangle.

We can likewise carry out linear
interpolations for functions of any number
of arguments.
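The linear interpolation through three given points can be sketched as follows. The code solves the system of three equations for a, b and c by Cramer's rule with exact rational arithmetic and refuses to proceed when the determinant vanishes, that is when the three points (x, y) lie on one straight line. The function names `det3` and `linear_interpolant` and the sample data are assumptions made here for illustration.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def linear_interpolant(p1, p2, p3):
    """Coefficients (a, b, c) of z = a x + b y + c through three points (x, y, z)."""
    pts = [tuple(Fraction(v) for v in p) for p in (p1, p2, p3)]
    matrix = [[x, y, Fraction(1)] for x, y, _ in pts]
    zs = [z for _, _, z in pts]
    d = det3(matrix)
    if d == 0:
        raise ValueError("the three points (x, y) lie on one straight line")
    coefficients = []
    for col in range(3):
        # Cramer's rule: replace one column by the right-hand sides.
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = zs[r]
        coefficients.append(det3(replaced) / d)
    return tuple(coefficients)

# Hypothetical data: values 1, 3, 4 at the vertices (0, 0), (1, 0), (0, 1).
a, b, c = linear_interpolant((0, 0, 1), (1, 0, 3), (0, 1, 4))
print(a, b, c)  # 2 3 1
```

For a triangle with a very acute angle the determinant is close to zero, so in floating-point work small errors in the data would be strongly magnified; this is the numerical side of the warning given in the text.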

4. Continuity and Discontinuity. The concept of continuity of
a function z = f(x, y) is quite similar to that of a function of one
argument which was discussed in Sec. I.16 and Sec. III.12. As an
example we can formulate the following definition of continuity:
a function f is called a continuous function for the values $x = x_0$
and $y = y_0$ of the arguments if for every process in which $x \to x_0$
and $y \to y_0$ (in an arbitrary way) we have $f(x, y) \to f(x_0, y_0)$.
Otherwise the function is said to be discontinuous for these values
of the arguments. Then the point with the coordinates $(x_0, y_0)$
lying in the x, y-plane is called the point of discontinuity of the
function. A function which is continuous at each point of a region
is said to be continuous in the region.

It should be noted that besides separate points of discontinuity
a function of two variables may have entire lines of discontinuity,
that is lines wholly consisting of points of discontinuity. For instance,
let us take the functions

$$z = \frac{1}{x^2 + y^2} \quad \text{and} \quad z = \frac{1}{(y - x)^2}$$

The first function has only one point of discontinuity (0, 0) whereas
the second function has an entire line of discontinuity, namely,
the straight line $(y - x)^2 = 0$, i.e. y = x. The level lines of the
functions are depicted in Fig. 196. In both cases the functions
approach infinity at the points of discontinuity. But, as in the case
of a function of one variable, there are other types of discontinuities.
In practical problems we often encounter such a line of discontinuity
of a function that in approaching any point of this line from one
side the function has a certain finite limit whereas in approaching
the same point from the other side of the line the function has a
different finite limit. In such a case the function has a finite jump
as the point (x, y) passes through the line. An approximate sketch
of the graph of such a function is depicted in Fig. 197.

Fig. 196

The behaviour of a function of two variables in the vicinity of its
point of discontinuity may essentially depend on the way the point
is approached. For instance, there may exist a limit depending
on the choice of a path of approaching the point for one group of
paths and there may exist neither a finite nor an infinite limit for
other paths etc. Since there is an infinitude of ways of approaching
a point of discontinuity (whereas we have only two main ways of
approaching a point of discontinuity in the case of a function of
one argument, namely, from the right or from the left side) points
of discontinuity of functions of several variables are, in general,
of a more complicated type than those of functions of one independent
variable. For example, the function $z = \dfrac{2xy}{x^2 + y^2}$ has its only point
of discontinuity at the point x = 0, y = 0 where the denominator
vanishes. Now if $x \to 0$ and $y \to 0$ in such a way that $\dfrac{y}{x} = k$, i.e.
y = kx, where k is a constant we obtain

$$z = \frac{2x \cdot kx}{x^2 + k^2 x^2} = \frac{2k}{1 + k^2}$$

and therefore the limit depends on the relation between y and x
(see Fig. 198). If in approaching a point there is no single finite
or infinite limit of a function then calculating the limit at the point
we must specify the way of approaching it to avoid misunderstandings.
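The dependence of the limit on the path can be seen directly by evaluating the function along straight lines y = kx. The sketch below assumes that the function under discussion is $z = 2xy/(x^2 + y^2)$, which reproduces the limits 1, 0 and -1 quoted in the caption of Fig. 198 for k = 1, 0 and -1; exact fractions are used so that the values close to the origin are not affected by rounding.

```python
from fractions import Fraction

def z(x, y):
    # Assumed example function, discontinuous only at the origin
    return 2 * x * y / (x * x + y * y)

def value_on_ray(k, x):
    """Value of z at the point (x, kx) of the straight line y = kx."""
    return z(x, k * x)

for k in (1, 0, -1, 2):
    near = value_on_ray(k, Fraction(1, 10**9))   # very close to the origin
    far = value_on_ray(k, Fraction(1, 10))       # farther away on the same line
    # On each line the function is constant and equals 2k / (1 + k^2).
    assert near == far == Fraction(2 * k, 1 + k * k)
    print(k, near)
```

Since the function is constant along every straight line through the origin but the constant changes with k, no single limit exists at the origin.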

The properties of continuous functions of two arguments defined
in a finite closed region in the x, y-plane are analogous to those
described in Sec. III.14 for functions of one argument
defined over a closed finite interval. We are therefore
not going to enumerate the properties again.

Fig. 197. (L) is the line of discontinuity of the function f; it lies in the plane xOy.

Fig. 198. In approaching the origin in the direction a the limit of z is equal
to 1; the limit is equal to zero in the direction b and it is equal
to -1 in the direction c. There is no limit in approaching the origin
along the spiral d.


We sometimes encounter the problem of solving an inequality
of the form f(x, y) > 0. This can be carried out in a way similar
to the method of solving inequalities discussed in Sec. III.15 [see
inequality (III.17)]. We should first draw the curve f(x, y) = 0
and the lines of discontinuity of the function f provided there are
such lines. All these curves break the plane into parts, and the
function retains its sign inside each of the parts. We can determine
the corresponding signs by calculating the values of the function
for the points arbitrarily chosen in each of the parts.

For instance, let us solve the inequality

$$\frac{x^2 + y^2 - 4}{x + y} > 0 \tag{3}$$

Here the circle $x^2 + y^2 - 4 = 0$ is the line of zeros and the
straight line x + y = 0 serves as the line of discontinuity. They
divide the plane into four parts shown in Fig. 199. Now we take
a point in each of the parts, for example, the points (-3, 0), (-1, 0),
(1, 0) and (3, 0), and determine the corresponding signs of the
function which are -, +, - and +. The regions where inequality
(3) holds are shaded in Fig. 199.
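The sign test at one point of each part of the plane can be reproduced in a few lines. The sketch below assumes that inequality (3) is $(x^2 + y^2 - 4)/(x + y) > 0$ and uses the four test points named in the text; the names `f`, `test_points` and `signs` are introduced here.

```python
def f(x, y):
    # Left-hand side of the inequality: zero on the circle x^2 + y^2 = 4,
    # discontinuous on the straight line x + y = 0
    return (x * x + y * y - 4) / (x + y)

test_points = [(-3, 0), (-1, 0), (1, 0), (3, 0)]   # one point per part of the plane
signs = ["+" if f(x, y) > 0 else "-" for x, y in test_points]

for point, sign in zip(test_points, signs):
    print(point, sign)
```

The inequality holds in the two parts that receive the sign +, that is in the part containing (-1, 0) and in the part containing (3, 0).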

5. Implicit Functions. The definition of an implicit function of two
arguments is similar to that in Sec. I.20 given for a function of
one independent variable. An implicit function z(x, y) is defined by an
equation of the form

$$F(x, y, z) = 0 \tag{4}$$

Here, as in Sec. I.20, the function z(x, y) may turn out to be
multiple-valued and then we can consider its single-valued branches.

Equation (4) may define a surface of an arbitrary form whereas
a surface determined by an equation z = f(x, y) is punctured by
any straight line parallel to the z-axis at not more than one point
(see Fig. 190).



§ 2. Functions of Arbitrary Number of Variables 

6. Methods of Representing. The main notions related to the 
analytical form of a function and to its properties are transferred 
to functions of any number of arguments. But there are some addi- 
tional difficulties in investigating such functions. First of all, the 
tabular and the graphical ways of their representation become too 
complicated. We can, of course, represent a function of three variab- 
les by means of a system of two-entry tables (see Sec. 1) or by a set 
of pictures similar to Figs. 189 or 192, but this is very difficult. 

But in certain cases the calculation of the values of a function
of a large number of arguments may be reduced to the
calculation of the values of several functions of a lower number of
variables. Then we can widely use the methods described in Sec. I.13
and Sec. 1. For instance, take a function of four arguments of the
form $u = f(x, y) + \varphi(z, t)$. To calculate the values of u we need
tables and graphs of the functions $f$ and $\varphi$. But each of the last two
functions depends only upon two variables and this facilitates the
calculations. Similarly, the calculation of the values of the function
$u = f[\varphi(x) + y, \psi(z) - t]$ depending on four independent variables
requires the representation of one function of two variables
and of two functions of one variable and so on. In such cases the
calculation and the investigation of functions become much easier.
Unfortunately, not all functions can be represented in this way.

7. Functions of Three Arguments. Another difficulty lies in the
geometrical interpretation of a functional relationship in case the
total number of variables (both dependent and independent) is
greater than 3, that is greater than the dimension of our usual geometrical
space. The situation is comparatively simpler for a function
of three arguments u = f(x, y, z) (the total number of variables
equals 4 here). In this case the domain of definition is either the whole
space of the variables x, y and z or some part of it, that is one or
several regions (domains) in the x, y, z-space (see Sec. 2; the notion
of a degenerated region should be appropriately changed in this
case). We can therefore represent such a domain geometrically.
For instance, the domain of definition of the function $u = x^2 + y^2 - z$
is the whole space whereas the function $u = \sqrt{1 - x^2 - y^2 - z^2}$
is defined only if

$$1 - x^2 - y^2 - z^2 \ge 0 \quad \text{or} \quad x^2 + y^2 + z^2 \le 1$$

i.e. in the last case the domain is the ball (solid sphere) of radius 1 with centre
at the origin of coordinates.

We can also consider level surfaces of a function f(x, y, z), understanding
them in the sense of Sec. 1, that is as surfaces in the x, y, z-space
on which the function is constant: f(x, y, z) = const.

Points of discontinuity (provided a function has them) lie in
the x, y, z-space and can also be represented in a visual way. The
points of discontinuity of a function of three independent variables 
can be located separately but they can also form lines of disconti- 
nuity and even surfaces of discontinuity , that is surfaces which enti- 
rely consist of points of discontinuity. For example, when investi- 
gating a passage from one physical medium into another we inter- 
pret the interface as a surface on which the quantities characterizing 
the properties of the media have discontinuities (for instance, this is 
applicable to investigating the passage of the light from water into 
air or from glass into air etc.). 

8. General Case. The concept of a space of variables is quite
visual and it is therefore convenient to transfer it to the case of
functions of any number of independent variables. This leads to
the notion of a many-dimensional Cartesian space (see Sec. VII.18,
example 3). For instance, let a function w = f(x, y, z, u) of four
arguments be considered. Then every quadruple of values x, y, z
and u determines a point in the space $E_4$ (strictly speaking, such
a quadruple is, by definition, a point of $E_4$). Thus the space $E_4$
serves as the space of arguments here; the domain of definition of
the function w is a region (domain) in the space $E_4$ or a set of a number
of such regions. The function may be continuous or may have
separate points of discontinuity, lines of discontinuity, two-dimensional
surfaces or three-dimensional hypersurfaces (see Sec. VII.19)
consisting of points of discontinuity.

To construct the “graph” of a function u = f(x, y, z, t) we need
the five-dimensional space $E_5$ of the variables x, y, z, t and u. To
find the points of the graph we must make x, y, z and t assume
arbitrary values and calculate the corresponding values of u. [For
instance, we can easily verify that the graph of the function
$u = xz - 2y^2 t$ passes through the points (1, 1, 2, 0, 2) and (-1, 2,
0, -2, 16) etc.] When we discussed functions of three variables
in Sec. 7 we saw that the space of the arguments had a visual geometrical
interpretation. But this is not so for its graph that lies
in the four-dimensional space $E_4$ which we cannot visualize.

We shall see in Sec. X.2 that a many-dimensional space can be
interpreted in a less formal way than that connected with the notion
of a Cartesian space of n-tuples consisting of n numbers.

9. Concept of Field. We say that there is a field of a quantity
in space if a certain value of the quantity is defined at each point
of the space. For instance, when investigating a flow of gas we consider
the field of temperature (the temperature has a certain value
at each point), the field of density, the field of velocities and so on.
A field can be a scalar field or a vector field depending on the properties
of the quantity in question. For example, a temperature
field and a density field are scalar ones whereas a velocity field or
a field of force is a vector field. A field is called stationary (steady-state)
if it does not change at each point of the space as time passes
and it is called non-stationary if such a change takes place.

For definiteness, let us denote a scalar quantity by the letter u
and an arbitrary (variable) point in space by the letter M. Then
to each position of the point M there corresponds a certain value
of the quantity u and we can therefore regard u as a function of M:
u = f(M). Such a function differs from those considered above
since a point is not a quantity. But in the general sense of the notion
of a function (widely used in modern mathematics) we can apply
the term “function” to every situation when there exists a certain
law according to which to the objects of one “kind” (of an arbitrary
nature) there correspond the objects of some other “kind” (in our
case the objects of the first kind are points in space and the objects
of the second kind which correspond to the points are the values
of the quantity u). In case a field is non-stationary we have
u = f(M, t) where t is the time.

Now we can easily pass from a function of a point to a function
of three variables, namely, of three spatial coordinates. To do this
it is sufficient to introduce a Cartesian coordinate system x, y, z
in space. Then the position of a point M in space is completely
characterized by the corresponding values x, y and z, that is we
can write u = u(x, y, z). Conversely, if a coordinate system x, y, z
is given then any function of x, y and z can be regarded as a function
of a point. But we should take into account that a field u = f(M)
(i.e. a function u of the point M) has its own meaning and can be
investigated without introducing any coordinate system. Besides,
introducing coordinate systems in different ways we obtain different
relationships of the form u = u(x, y, z) for one and the same
function u = f(M). Hence, when investigating a field we regard
the concept of a function of a point as a primary concept relative
to the concept of a function of the coordinates of a point.
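The distinction between a function of a point and a function of coordinates can be made concrete with a small sketch. The field used here is hypothetical: its value at a point M is $1/(1 + r^2)$ where r is the distance from M to a fixed source S. In a first coordinate system S has the coordinates (1, 2, 0); a second system has the same directions of axes but its origin at S. The names `u_system1` and `u_system2` are introduced only for this illustration.

```python
import math

def u_system1(x, y, z):
    # The field written in the first system, where the source is at (1, 2, 0)
    return 1.0 / (1.0 + (x - 1.0) ** 2 + (y - 2.0) ** 2 + z**2)

def u_system2(x, y, z):
    # The same field written in the second system, whose origin is the source
    return 1.0 / (1.0 + x**2 + y**2 + z**2)

# One and the same point M described in the two systems.
m_in_system1 = (3.0, 2.0, 1.0)
m_in_system2 = (2.0, 0.0, 1.0)   # shifted by (-1, -2, 0)

value1 = u_system1(*m_in_system1)
value2 = u_system2(*m_in_system2)
print(value1, value2)            # the field value at M does not depend on the system

# The two formulas are nevertheless different functions of three numbers.
print(u_system1(2.0, 0.0, 1.0), u_system2(2.0, 0.0, 1.0))
```

The value of the field at a given point is the same in both descriptions, while the expression in terms of x, y and z changes with the coordinate system.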

If a quantity, according to its physical or geometrical meaning, 
depends on the position of a point in a plane we call the corres- 
ponding field a plane field. We encounter such fields when investi- 
gating thermal processes in a thin plate whose thickness can be 
neglected. 

If we have a space field of a quantity u which depends only on x 
and y and does not depend on z in a certain coordinate system x , y, z 
the field is said to be a plane-parallel field. Then we can regard such 
a field as being defined in the x, y-plane (and as independent of z) 
which means that the field can be treated as a plane field. But of 
course we should keep in mind that in reality the field is spatial 
and that the relationship between u and the coordinates is the same 
in all the planes parallel to the x, y-plane. 

§ 3. Partial Derivatives and Differentials 
of the First Order 

10. Basic Definitions. Let a function of several independent
variables be given. For definiteness, let it be a function of three
arguments u = f(x, y, z). If we fix certain values of all the arguments
but one the variable u becomes a function of this single argument.
We can therefore differentiate the function with respect to
the argument and take its differential as it was done in Chapter IV
for a function of one independent variable. Such derivatives are
called partial derivatives. The corresponding differentials are partial
differentials of the function. In other words,

$$u'_x = f'_x(x, y, z) = \lim_{\Delta x \to 0} \frac{\Delta_x u}{\Delta x}, \qquad u'_y = f'_y(x, y, z) = \lim_{\Delta y \to 0} \frac{\Delta_y u}{\Delta y}$$

where $\Delta_x u = f(x + \Delta x, y, z) - f(x, y, z)$ and
$\Delta_y u = f(x, y + \Delta y, z) - f(x, y, z)$ are the partial increments of the function.
A partial increment corresponds to a change of one of the variables
when all the other variables are kept constant. Let the reader write
the expression of the increment $\Delta_z u$ and of the derivative $u'_z$. One
should take into account that the symbol $u'$ or $f'(x, y, z)$ has no
sense for a function of several variables because one must necessarily
indicate the variable with respect to which the derivative is taken.

The computation of partial derivatives of concrete elementary
functions is performed according to the rules given in Sec. IV.5. When
we differentiate with respect to a certain variable we must regard
all the other variables as constants.

Example. Let $u = x^2 z^3 - y^z$; then $u'_x = 2x z^3$, $u'_y = -z y^{z-1}$ and
$u'_z = 3x^2 z^2 - y^z \ln y$ (check these results!).
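The results of the example can be checked numerically by changing one argument at a time while the others are kept constant. The sketch below assumes that the example function is $u = x^2 z^3 - y^z$ and estimates each partial derivative by a central difference; the helper name `partial`, the step size and the test point are choices made here.

```python
import math

def u(x, y, z):
    # Assumed example function u = x^2 z^3 - y^z
    return x**2 * z**3 - y**z

def partial(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect
    to the argument number `index`; all other arguments are held fixed."""
    lower = list(point)
    upper = list(point)
    lower[index] -= h
    upper[index] += h
    return (f(*upper) - f(*lower)) / (2 * h)

x, y, z = 1.5, 2.0, 0.5          # arbitrary test point with y > 0
analytic = (
    2 * x * z**3,                               # u'_x
    -z * y ** (z - 1),                          # u'_y
    3 * x**2 * z**2 - y**z * math.log(y),       # u'_z
)
numeric = tuple(partial(u, (x, y, z), i) for i in range(3))

for name, a, n in zip(("u'_x", "u'_y", "u'_z"), analytic, numeric):
    print(name, a, n)
```

The agreement of the two columns at one point does not prove the formulas, but a wrong formula would almost certainly show up as a visible discrepancy.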

A partial differential is denoted by the symbol d with a subscript 
indicating the argument with respect to which the differentiation 
is performed. Thus, 

$$d_x u = u'_x\,\Delta x, \qquad d_y u = u'_y\,\Delta y \qquad \text{and} \qquad d_z u = u'_z\,\Delta z$$


In particular, if we put $u = x$ here we obtain

$$d_x x = \Delta x \quad \text{and} \quad d_y x = d_z x = 0 \tag{5}$$

since $x'_x = 1$ and $x'_y = x'_z = 0$. We have therefore $d_x u = u'_x\,d_x x$
which implies $u'_x = \frac{d_x u}{d_x x}$. One usually omits the subscripts in
putting down the last formula and simply writes $u'_x = \frac{\partial u}{\partial x}$ because
the denominator itself indicates that the derivative is taken with
respect to $x$ (or with respect to some other variable in other cases).
Similarly, $u'_y = \frac{\partial u}{\partial y}$ and $u'_z = \frac{\partial u}{\partial z}$. This rule of writing differentials
sometimes becomes inconvenient. Indeed, in the first place, not
only denominators differ in the last two expressions but the
numerators as well. Actually, the numerator of the first fraction must be
regarded as $d_y u$ whereas the numerator of the second fraction as $d_z u$.

In the second place, for example, we cannot write $\frac{\partial x}{\partial u}$ instead of
$1 \big/ \frac{\partial u}{\partial x}$ since the differentials in these expressions have a different
sense (in the first expression the differentials are taken with respect
to $u$ whereas in the second one with respect to $x$).

When using partial derivatives one must be careful and pay 
much attention to the choice of independent variables. For instance, 
if we write the expression for the power of an electric current in
the form $P = \frac{U^2}{R}$ where $U$ is the voltage and $R$ is the resistance of
the circuit we obtain $\frac{\partial P}{\partial R} = -\frac{U^2}{R^2} = -\frac{P}{R}$. But if the same formula
is written in the form $P = I^2 R$ where $I$ is the current
then we get $\frac{\partial P}{\partial R} = I^2 = \frac{P}{R}$. There is no contradiction
between these results. Actually, if we write them in full we shall have
$\left(\frac{\partial P}{\partial R}\right)_{U = \text{const}} = -\frac{P}{R}$ for the first result and
$\left(\frac{\partial P}{\partial R}\right)_{I = \text{const}} = \frac{P}{R}$ for the
second one. (Let the reader think about the physical meaning of
the signs $+$ and $-$ entering into the last formulas.)
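
The dependence of a partial derivative on the choice of independent
variables can be checked numerically. The sketch below is only an
illustration: the values U = 12 and R = 4 and the helper names are
assumptions introduced here, and the derivatives are estimated by a
symmetric difference quotient rather than computed exactly.

```python
def power_from_voltage(U, R):
    """P = U^2 / R, with U and R as the independent variables."""
    return U ** 2 / R


def power_from_current(I, R):
    """P = I^2 * R, with I and R as the independent variables."""
    return I ** 2 * R


def central_difference(g, x, step=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + step) - g(x - step)) / (2 * step)


# Illustrative values (assumed): U = 12, R = 4, hence I = U / R = 3.
U, R = 12.0, 4.0
I = U / R
P = power_from_voltage(U, R)

# Vary R while U is held constant, then while I is held constant.
dP_dR_const_U = central_difference(lambda r: power_from_voltage(U, r), R)
dP_dR_const_I = central_difference(lambda r: power_from_current(I, r), R)

print(dP_dR_const_U, -P / R)  # the two numbers should nearly coincide
print(dP_dR_const_I, P / R)   # the two numbers should nearly coincide
```

The same power P and the same resistance R give derivatives of opposite
sign, depending on which quantity is kept fixed while R varies.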

11. Total Differential. The total (or exact) differential of a function 
u = f (x, y, z ) is equal to the sum of all its partial differentials: 

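$$du = d_x u + d_y u + d_z u = u'_x\,\Delta x + u'_y\,\Delta y + u'_z\,\Delta z = \frac{\partial u}{\partial x}\Delta x + \frac{\partial u}{\partial y}\Delta y + \frac{\partial u}{\partial z}\Delta z \tag{6}$$

Formula (6) must be regarded as the definition of a total differential.
In particular, if we put $u = x$ formula (5) implies that

$$dx = d_x x + d_y x + d_z x = \Delta x$$

and hence the total differential of an independent variable is equal
to the increment of the variable (compare with Sec. IV.8).
Formula (6) can therefore be rewritten as

$$du = u'_x\,dx + u'_y\,dy + u'_z\,dz = \frac{\partial u}{\partial x}dx + \frac{\partial u}{\partial y}dy + \frac{\partial u}{\partial z}dz \tag{7}$$

For example,

$$d\left(x^2 \sin\frac{x}{y}\right) = \left(2x \sin\frac{x}{y} + \frac{x^2}{y}\cos\frac{x}{y}\right)dx - \frac{x^3}{y^2}\cos\frac{x}{y}\,dy$$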

The connection between the total differential of a function and 
the total increment of the function is analogous to the one described 
in Sec. IV.8. Let the independent variables receive increments $\Delta x$,
$\Delta y$ and $\Delta z$. Then $u$ receives the increment

$$\Delta u = f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x, y, z)$$

This is the total increment. It can be represented in the form of a
sum of three increments so that each increment corresponds to the
change of one of the variables. Namely,

$$\begin{aligned} \Delta u = {} & [f(x + \Delta x, y, z) - f(x, y, z)] \\ & + [f(x + \Delta x, y + \Delta y, z) - f(x + \Delta x, y, z)] \\ & + [f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x + \Delta x, y + \Delta y, z)] \end{aligned} \tag{8}$$

The reader must carefully check up this formula! The increments
in the square brackets entering into formula (8) are nothing but
partial increments and are therefore connected with the partial
differentials in the way described by formula (IV.23). Hence,

$$\begin{aligned} f(x + \Delta x, y, z) - f(x, y, z) &= f'_x(x, y, z)\,\Delta x + \alpha_1\,\Delta x, \\ f(x + \Delta x, y + \Delta y, z) - f(x + \Delta x, y, z) &= f'_y(x + \Delta x, y, z)\,\Delta y + \alpha_2\,\Delta y \\ &= [f'_y(x, y, z) + \alpha_3]\,\Delta y + \alpha_2\,\Delta y \\ &= f'_y(x, y, z)\,\Delta y + \alpha_4\,\Delta y, \\ f(x + \Delta x, y + \Delta y, z + \Delta z) - f(x + \Delta x, y + \Delta y, z) &= f'_z(x, y, z)\,\Delta z + \alpha_5\,\Delta z \end{aligned}$$

(the last formula is deduced in like manner) where all the quantities
denoted by $\alpha$ tend to zero as $\Delta x \to 0$, $\Delta y \to 0$ and $\Delta z \to 0$.
Substituting these expressions into (8) we obtain

$$\Delta u = u'_x\,\Delta x + u'_y\,\Delta y + u'_z\,\Delta z + \alpha\,\Delta x + \beta\,\Delta y + \gamma\,\Delta z = du + \alpha\,\Delta x + \beta\,\Delta y + \gamma\,\Delta z \tag{9}$$

where $\alpha$, $\beta$ and $\gamma \to 0$ as $\Delta x$, $\Delta y$ and $\Delta z \to 0$. Consequently, in the
general case we can also say that the total differential is the principal
linear part of the increment of a function. We call it linear because
it equals the sum of summands which are directly proportional
to the increments of the arguments and it is regarded as the
principal part of the total increment since it differs from the increment
by an infinitesimal variable of higher order with respect to the
increments of the arguments (compare with Sec. 3). As in the case
of a function of one independent variable, the replacement of an
increment of a function by its differential is equivalent to the
replacement of a non-linear function by a linear one.

Fig. 200

Formula (9) is illustrated in Fig. 200 for the case of two indepen- 
dent variables. The values of the function (with the infinitesimals 
of higher order neglected) are put down near the corresponding 
points of the x, y-plane. 

As in Sec. IV.10, the approximate relations $\Delta_x u \approx d_x u$,
$\Delta_y u \approx d_y u$, $\Delta_z u \approx d_z u$ and, particularly, $\Delta u \approx du$ are the source of
many useful concrete approximate formulas. We note here that
the last formula can be rewritten in full as

$$f(a + h, b + k, c + l) \approx f(a, b, c) + f'_x(a, b, c)\,h + f'_y(a, b, c)\,k + f'_z(a, b, c)\,l$$
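
A short numerical sketch may help to see how good the last approximate
formula is. The function f, the point (a, b, c) and the increments
below are chosen purely for illustration and do not come from the
text; the partial derivatives are written out by hand.

```python
import math


def f(x, y, z):
    # Sample function (an assumption made for this illustration).
    return x * x * y + math.sin(z)


def gradient(x, y, z):
    # Partial derivatives f'_x, f'_y, f'_z of the sample function.
    return (2 * x * y, x * x, math.cos(z))


def linear_estimate(a, b, c, h, k, l):
    """Right-hand side of the approximate formula f(a+h, b+k, c+l) ~ ..."""
    fx, fy, fz = gradient(a, b, c)
    return f(a, b, c) + fx * h + fy * k + fz * l


def approximation_error(scale):
    """|exact value - linear estimate| for increments proportional to scale."""
    a, b, c = 1.0, 2.0, 0.5
    h, k, l = scale, -2 * scale, 3 * scale
    exact = f(a + h, b + k, c + l)
    return abs(exact - linear_estimate(a, b, c, h, k, l))


errors = [approximation_error(s) for s in (1e-1, 1e-2, 1e-3)]
for scale, err in zip((1e-1, 1e-2, 1e-3), errors):
    print(scale, err)
```

When the increments are made ten times smaller the error drops roughly
a hundredfold, which is what "an infinitesimal of higher order" means
in practice.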
It is also easy to deduce the formula

$$\alpha_u = |f'_x(x, y, z)|\,\alpha_x + |f'_y(x, y, z)|\,\alpha_y + |f'_z(x, y, z)|\,\alpha_z$$

which is analogous to formula (IV.29) and is obtained in a similar
way.

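For instance, let $u = xy$. Then

$$\alpha_u = |y|\,\alpha_x + |x|\,\alpha_y,$$

$$\delta_u = \frac{\alpha_u}{|u|} = \frac{|y|\,\alpha_x + |x|\,\alpha_y}{|xy|} = \frac{\alpha_x}{|x|} + \frac{\alpha_y}{|y|} = \delta_x + \delta_y$$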


that is the operation of multiplication (and, similarly, the operation 
of division) of approximate numbers yields the addition of their 
maximum relative errors (let the reader check up this rule for the 
case of division!). 
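
The rule for the product can be tried on concrete numbers. In the
sketch below the approximate values and their maximum absolute errors
are invented for the illustration, and the function name is an
assumption; the code merely evaluates the two formulas written above
for u = xy.

```python
def product_error_bounds(x, alpha_x, y, alpha_y):
    """Maximum absolute and relative errors of u = x * y to first order."""
    alpha_u = abs(y) * alpha_x + abs(x) * alpha_y
    delta_u = alpha_u / abs(x * y)
    return alpha_u, delta_u


# Illustrative data (assumed): x = 2.50 +/- 0.01, y = 4.00 +/- 0.02.
x, alpha_x = 2.50, 0.01
y, alpha_y = 4.00, 0.02

alpha_u, delta_u = product_error_bounds(x, alpha_x, y, alpha_y)
delta_x = alpha_x / abs(x)
delta_y = alpha_y / abs(y)

# The largest actual deviation of the product for errors of this size.
worst_case = (x + alpha_x) * (y + alpha_y) - x * y

print(delta_u, delta_x + delta_y)
print(alpha_u, worst_case)
```

The true worst-case deviation exceeds the bound obtained from the
differential only by the product of the two absolute errors, that is
by a quantity of higher order.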

12. Derivative of Composite Function. Let again $u = f(x, y, z)$
and let the variables no longer be independent but in their turn
depend on some independent variables $s$ and $t$. Thus we have

$$u = u(x, y, z), \quad x = x(s, t), \quad y = y(s, t), \quad z = z(s, t) \tag{10}$$

We see that u becomes a composite function of s and t. To calculate 
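the partial derivative $u'_s$ let us fix $t$ and make $s$ receive an increment
$\Delta s$. Then $x$, $y$ and $z$ will also receive certain partial increments and
therefore $u$ also gains an increment which, according to formula (9),
can be written in the form

$$\Delta_s u = u'_x\,\Delta_s x + u'_y\,\Delta_s y + u'_z\,\Delta_s z + \alpha\,\Delta_s x + \beta\,\Delta_s y + \gamma\,\Delta_s z$$

If now we divide both sides by $\Delta s$ and pass to the limit as $\Delta s \to 0$
we shall obtain

$$u'_s = \frac{\partial u}{\partial s} = u'_x x'_s + u'_y y'_s + u'_z z'_s = \frac{\partial u}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial u}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial u}{\partial z}\frac{\partial z}{\partial s} \tag{11}$$

The derivative $u'_t$ is expressed similarly. Thus we see that the rule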
we have deduced is analogous to that of differentiating a function 
of one independent variable [see formula (IV.9)] but the number of 
summands is greater here since the derivatives with respect to all 
intermediate variables also enter into the differentiation formula. 
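
Formula (11) is easy to test on a concrete composite function. The
functions u, x, y and z below are assumptions chosen only so that all
derivatives can be written by hand; the result of the formula is
compared with a difference quotient of the composite function.

```python
def u(x, y, z):
    # Sample outer function (assumed for this illustration).
    return x * y + z * z


def intermediates(s, t):
    # Sample intermediate variables x(s, t), y(s, t), z(s, t).
    return s * t, s + t, s - t


def du_ds_chain(s, t):
    """u'_s computed as u'_x x'_s + u'_y y'_s + u'_z z'_s, formula (11)."""
    x, y, z = intermediates(s, t)
    u_x, u_y, u_z = y, x, 2 * z      # partial derivatives of u
    x_s, y_s, z_s = t, 1.0, 1.0      # derivatives of x, y, z with respect to s
    return u_x * x_s + u_y * y_s + u_z * z_s


def composite(s, t):
    return u(*intermediates(s, t))


s0, t0 = 1.5, 0.5
step = 1e-6
numeric = (composite(s0 + step, t0) - composite(s0 - step, t0)) / (2 * step)
chain = du_ds_chain(s0, t0)
print(chain, numeric)
```

Each of the three summands corresponds to one intermediate variable;
dropping any of them makes the two printed numbers disagree.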

Formula (11) implies [this is analogous to the results obtained
in Sec. IV.9] that formula (7) [but not formula (6)!] remains true
even in the case when the former independent variables turn out
to be dependent on some other variables. In fact, in the case of
formula (10) we have

$$\begin{aligned} du &= u'_s\,ds + u'_t\,dt = (u'_x x'_s + u'_y y'_s + u'_z z'_s)\,ds + (u'_x x'_t + u'_y y'_t + u'_z z'_t)\,dt \\ &= u'_x(x'_s\,ds + x'_t\,dt) + u'_y(y'_s\,ds + y'_t\,dt) + u'_z(z'_s\,ds + z'_t\,dt) \\ &= u'_x\,dx + u'_y\,dy + u'_z\,dz \end{aligned}$$

which is what we set out to prove. Hence, formula (7) is invariant,
that is holds for all cases [just as it is for formula (IV.22)].

The invariance of the form of the total differential of a function 
implies many differentiation formulas. For instance, if $w = uv$
where $u$ and $v$ may depend on some other variables then formula (7)
yields $dw = w'_u\,du + w'_v\,dv = v\,du + u\,dv$. Consequently, the
formula $d(uv) = v\,du + u\,dv$ holds for all cases. In a similar way
we can verify the validity of the following formulas:

$$d(u \pm v) = du \pm dv, \qquad d(Cu) = C\,du \quad (C = \text{const}),$$

$$d\left(\frac{u}{v}\right) = \frac{v\,du - u\,dv}{v^2}, \qquad d(u^n) = nu^{n-1}\,du, \qquad d(\sin u) = \cos u\,du \quad \text{etc.}$$

In many cases these formulas enable us to calculate a total diffe- 
rential directly, without computing the corresponding partial 
derivatives. 

For example, $d(\sin x^2 y^3) = \cos(x^2 y^3)\,d(x^2 y^3) = \cos(x^2 y^3)\,[y^3\,d(x^2) + x^2\,d(y^3)] = \cos(x^2 y^3)\,(2xy^3\,dx + 3x^2 y^2\,dy)$.
Conversely, the total differential of a function being given, we can
restore the partial derivatives determining them by taking the
coefficients in $dx$ and $dy$.

Here we give several examples on calculating derivatives. 


(1) Let $u = f\left(\sqrt{x^2 + y^2}\right)$. Then

$$u'_x = f'\left(\sqrt{x^2 + y^2}\right)\left(\sqrt{x^2 + y^2}\right)'_x = f'\left(\sqrt{x^2 + y^2}\right)\frac{2x}{2\sqrt{x^2 + y^2}} = \frac{x}{\sqrt{x^2 + y^2}}\,f'\left(\sqrt{x^2 + y^2}\right)$$

Here the function $f$ itself is a function of one variable. This variable
is replaced by $\sqrt{x^2 + y^2}$; the symbol $f'$ designates the derivative
of $f$ with respect to its single argument.

(2) Let = y). Then 



v ) ( (f’T ’ y) 


Here the function $f$ itself is a function of three variables. The
expressions — , — and $y$ are substituted, respectively, for these
independent variables; the expressions $f'_{\mathrm{I}}$, $f'_{\mathrm{II}}$ and $f'_{\mathrm{III}}$ designate the
derivatives of $f$ taken with respect to these three variables.

(3) Let $y = x^{\sin x}$. Then to compute $y'$ we can take logarithms
as it was recommended in the end of Sec. IV.5. But we can
also use another method based on the above results. Let us
denote $y = u^{\sin v}$ where $u = x$ and $v = x$ (such intermediate
variables are usually introduced mentally without putting them
down). Now we can differentiate $y$ as a composite function ($u$ and $v$
are regarded as intermediate variables):

$$y'_x = y'_u u'_x + y'_v v'_x = \sin v \cdot u^{\sin v - 1} \cdot 1 + u^{\sin v} \ln u \cos v \cdot 1 = \sin x \cdot x^{\sin x - 1} + x^{\sin x} \ln x \cos x$$

It is obvious that the last method is more general. If we have 
to compute the derivative with respect to x of an expression into 
which x enters several times we should differentiate with respect 
to each (“imagined”) argument that involves x, multiply by the 
derivative with respect to x and add together the results obtained. 
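
The result of example (3) can be checked by comparing it with a
difference quotient. The point x = 2 and the step are arbitrary
choices made for this sketch.

```python
import math


def y(x):
    return x ** math.sin(x)


def y_prime(x):
    """Derivative obtained by differentiating with respect to each
    occurrence of x separately and adding the results."""
    s = math.sin(x)
    through_base = s * x ** (s - 1)                         # x in the base
    through_exponent = x ** s * math.log(x) * math.cos(x)   # x in the exponent
    return through_base + through_exponent


x0 = 2.0
step = 1e-6
numeric = (y(x0 + step) - y(x0 - step)) / (2 * step)
formula = y_prime(x0)
print(formula, numeric)
```

Keeping only one of the two summands reproduces the familiar mistake of
treating the expression either as a pure power or as a pure
exponential.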

Here we are only going to consider functions of three variables 
but the following definition will also be useful for our further aims 
since it can be easily transferred to the case of functions of any 
number of independent variables. A function $F(x, y, z)$ is called
a homogeneous function of degree $k$ if for any $t > 0$ we have

$$F(tx, ty, tz) = t^k F(x, y, z) \tag{12}$$

For instance, the function $F(x, y, z) = x^2 - 3yz$ is a homogeneous
function of degree 2 because

$$F(tx, ty, tz) = (tx)^2 - 3(ty)(tz) = t^2(x^2 - 3yz) = t^2 F(x, y, z)$$

xsin (-f-) 

We can similarly verify that the function - — - — is a homoge- 
neous function of degree zero, the function 1 is a 

1 fx—y — z 

A 

homogeneous function of degree — whereas, for example, the 

function $x + 2y - z + 1$ is not a homogeneous one at all.

In the general case formula (12) can be written as
$F(ta, tb, tc) = t^k F(a, b, c)$ for any $a$, $b$ and $c$. If we differentiate with respect
to $t$ (and regard $a$, $b$ and $c$ as constants) we obtain

$$F'_x(ta, tb, tc)\,a + F'_y(ta, tb, tc)\,b + F'_z(ta, tb, tc)\,c = kt^{k-1} F(a, b, c)$$

Now putting $t = 1$, $a = x$, $b = y$ and $c = z$ in the last formula we
deduce the formula

$$xF'_x(x, y, z) + yF'_y(x, y, z) + zF'_z(x, y, z) = kF(x, y, z)$$

which expresses the so-called Euler's theorem on homogeneous functions.
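
Both the definition (12) and Euler's theorem can be verified for the
function $x^2 - 3yz$ considered above. The test point and the value of
t are arbitrary assumptions of this sketch; the partial derivatives
are written by hand.

```python
def F(x, y, z):
    return x * x - 3 * y * z


def grad_F(x, y, z):
    # F'_x, F'_y, F'_z for F = x^2 - 3yz.
    return (2 * x, -3 * z, -3 * y)


k = 2                       # degree of homogeneity
point = (1.5, -2.0, 0.5)    # arbitrary test point
t = 3.0                     # arbitrary positive factor

scaled = F(*(t * c for c in point))
homogeneity_gap = scaled - t ** k * F(*point)

euler_left = sum(c * d for c, d in zip(point, grad_F(*point)))
euler_right = k * F(*point)
print(homogeneity_gap, euler_left, euler_right)
```

For a function that is not homogeneous, such as $x + 2y - z + 1$, no
single value of k makes the two sides agree at all points.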

13. Derivative of Implicit Function. Let an implicit function
$z = z(x, y)$ be defined by an equation

$$F(x, y, z) = 0 \tag{13}$$

To calculate the partial derivative $z'_x$ we must fix $y$ and differentiate
formula (13) with respect to $x$ taking into account that $z$ also depends
on $x$. Thus, applying the rule of differentiating a composite function
we obtain

$$\frac{\partial F}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = 0, \quad \text{i.e.} \quad F'_x + F'_z z'_x = 0 \tag{14}$$

which implies

$$z'_x = -\frac{F'_x(x, y, z)}{F'_z(x, y, z)} \tag{15}$$

Similarly, $z'_y = -\frac{F'_y}{F'_z}$. To guarantee that this derivative assumes a
finite value we should additionally introduce the requirement

$$F'_z(x, y, z) \neq 0 \tag{16}$$


Formula (16) expresses a sufficient condition for the existence 
of an implicit function z = z (x, y) defined by equation (13); the 
geometrical meaning of the condition will be illustrated in Sec. XII. 3. 
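
Formula (15) can be compared with direct differentiation in a case
where the implicit function is known explicitly. The equation of the
unit sphere used below is an assumption made for the illustration; on
its upper half we have z = sqrt(1 - x^2 - y^2).

```python
import math


def F_x(x, y, z):
    # Partial derivative of F = x^2 + y^2 + z^2 - 1 with respect to x.
    return 2 * x


def F_z(x, y, z):
    # Partial derivative of F with respect to z.
    return 2 * z


def implicit_dz_dx(x, y, z):
    """z'_x = -F'_x / F'_z, valid only where F'_z is not zero (condition (16))."""
    fz = F_z(x, y, z)
    if fz == 0:
        raise ValueError("F'_z = 0: formula (15) does not apply at this point")
    return -F_x(x, y, z) / fz


def explicit_z(x, y):
    # Upper half of the sphere, solved for z.
    return math.sqrt(1 - x * x - y * y)


x0, y0 = 0.3, 0.4
z0 = explicit_z(x0, y0)
from_formula = implicit_dz_dx(x0, y0, z0)

step = 1e-6
from_difference = (explicit_z(x0 + step, y0) - explicit_z(x0 - step, y0)) / (2 * step)
print(from_formula, from_difference)
```

On the equator of the sphere z = 0 and hence F'_z = 0; there the
sketch raises an error, in agreement with requirement (16).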

Implicit functions may be defined by a system of equations.
Suppose we have $m$ equations which are compatible (that is they do not
contradict each other and can be solved, at least theoretically),
independent (this means that none of the equations is the consequence
of the others) and connect $n$ variables. Then if $m < n$ (i.e. the number
of equations is less than the number of variables) we
can regard certain $n - m$ variables as independent. We can make
them take on arbitrary values and express the remaining $m$ variables
as functions of these independent variables by solving the equations.
(If the number of equations is equal to or exceeds the number of
variables then, generally, we get some discrete values and it is therefore
impossible to construct functions.) As an example let us take the
case of two equations containing five variables, namely,

$$\left.\begin{aligned} f(x, y, z, u, v) &= 0 \\ \varphi(x, y, z, u, v) &= 0 \end{aligned}\right\} \tag{17}$$

Here we can regard three variables as independent and the other 
two variables as functions of these independent variables. For de- 
finiteness, let us assume that u = u (x, y, z) and v = v (x, y, z) 
and try to compute the derivatives u' x and v' x . For this purpose we 
differentiate both equations (17) (y and z are regarded as fixed). 
This yields 


$$\left.\begin{aligned} f'_x \cdot 1 + f'_u u'_x + f'_v v'_x &= 0 \\ \varphi'_x \cdot 1 + \varphi'_u u'_x + \varphi'_v v'_x &= 0 \end{aligned}\right\} \quad \text{i.e.} \quad \left.\begin{aligned} f'_u u'_x + f'_v v'_x &= -f'_x \\ \varphi'_u u'_x + \varphi'_v v'_x &= -\varphi'_x \end{aligned}\right\} \tag{18}$$

Hence, we have a system of two equations of the first degree
[i.e. system (18)] containing the two unknown quantities $u'_x$ and $v'_x$.
If the determinant of the system is not equal to zero the system can
be solved (see Sec. VI.4). Let us therefore suppose that

$$\begin{vmatrix} f'_u & f'_v \\ \varphi'_u & \varphi'_v \end{vmatrix} \neq 0 \tag{19}$$

Solving the equations we can find the sought-for derivatives.

The derivatives with respect to y and z are found similarly. It 
is important that the determinant of the system for the derivatives 
with respect to y and z is equal to (19) again (only the right-hand 
sides of the system differ from the former ones). Consequently, 
(19) is a sufficient condition for existence of the implicit functions 
$u = u(x, y, z)$ and $v = v(x, y, z)$ which are defined by system
of equations (17). In the general case such a condition is derived in 
a similar way and is analogous to (19). 
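
The procedure can be followed on a small system. The two equations
below, u + v - x = 0 and uv - y = 0, are an assumption made for this
sketch (they contain four variables rather than five); they are chosen
because u can also be found explicitly as a root of a quadratic
equation, which gives an independent check.

```python
import math


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*p + a12*q = b1, a21*p + a22*q = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the determinant is zero; condition (19) fails")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det


def implicit_derivatives_wrt_x(x, y, u, v):
    """u'_x and v'_x from system (18) for f = u + v - x, phi = u*v - y."""
    f_u, f_v, f_x = 1.0, 1.0, -1.0
    phi_u, phi_v, phi_x = v, u, 0.0
    return solve_2x2(f_u, f_v, -f_x, phi_u, phi_v, -phi_x)


def explicit_u(x, y):
    # Larger root of t^2 - x*t + y = 0, i.e. u solved explicitly.
    return (x + math.sqrt(x * x - 4 * y)) / 2


x0, y0 = 5.0, 4.0
u0 = explicit_u(x0, y0)
v0 = x0 - u0
du_dx, dv_dx = implicit_derivatives_wrt_x(x0, y0, u0, v0)

step = 1e-6
du_dx_numeric = (explicit_u(x0 + step, y0) - explicit_u(x0 - step, y0)) / (2 * step)
print(du_dx, dv_dx, du_dx_numeric)
```

Here the determinant equals u - v, so the method breaks down exactly
where the two roots of the quadratic coincide.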

A functional determinant (i.e. a determinant whose elements
are the derivatives of some functions) of form (19) is widely
encountered in mathematics and is called a Jacobian after K. Jacobi
(1804-1851), a German mathematician. There is a special symbol for
designating such a determinant:

$$\frac{D(f, \varphi)}{D(u, v)} = \begin{vmatrix} f'_u & f'_v \\ \varphi'_u & \varphi'_v \end{vmatrix}$$

The expression $\frac{D(f, \varphi)}{D(u, v)}$ should be regarded as an indivisible symbol
because at the present moment the denominator and the numerator taken
separately make no sense to us yet.

Analogous questions arise when we have to solve a system of 
equations containing parameters. For example, let the following 
system of two equations with the two unknown quantities x and 
y be considered: 

$$\left.\begin{aligned} f(x, y, \alpha, \beta, \gamma, \ldots) &= 0 \\ \varphi(x, y, \alpha, \beta, \gamma, \ldots) &= 0 \end{aligned}\right\}$$

where $\alpha, \beta, \gamma, \ldots$ are parameters. Suppose that the system has the
solution $x_0$, $y_0$ for certain values $\alpha_0, \beta_0, \gamma_0, \ldots$ of the parameters
and $\frac{D(f, \varphi)}{D(x, y)} \neq 0$ for these values. Then, by the above reasoning,
the system defines $x$ and $y$ as functions of $\alpha, \beta, \gamma, \ldots$, that is the
system has a uniquely defined solution as the parameters vary and
take on certain values lying in the vicinity of the values $\alpha_0, \beta_0, \gamma_0, \ldots$.
By the way, as it will be shown in Sec. XII.3, the condition
$\frac{D(f, \varphi)}{D(x, y)} \neq 0$ guarantees only the local solvability of the system because
the system may not be solvable if the increments of the parameters
become too large. It should be underlined that such a “stability”
of a solution with respect to variations of the parameters can be
guaranteed only if the number $m$ of the equations is equal to the
number $n$ of the unknown quantities. If $m < n$ then $n - m$
unknown quantities remain arbitrary and if $m > n$ then the
solvability of the system implies that there are $m - n$ additional
relationships between the parameters.

§ 4. Partial Derivatives and Differentials of 
Higher Orders 

14. Definitions. For definiteness, let $u = f(x, y, z)$ (functions
of any number of arguments can be investigated similarly). Then,
as has been shown, we have the three partial derivatives of the first
order $u'_x = f'_x(x, y, z)$, $u'_y = f'_y(x, y, z)$ and $u'_z = f'_z(x, y, z)$.
Each of them can be differentiated again with respect to $x$, $y$
and $z$. Hence, we obtain the following nine partial derivatives of
the second order:

$$\begin{aligned} u''_{xx} &= f''_{xx}(x, y, z), & u''_{xy} &= f''_{xy}(x, y, z), & u''_{xz} &= f''_{xz}(x, y, z), \\ u''_{yx} &= f''_{yx}(x, y, z), & u''_{yy} &= f''_{yy}(x, y, z), & u''_{yz} &= f''_{yz}(x, y, z), \\ u''_{zx} &= f''_{zx}(x, y, z), & u''_{zy} &= f''_{zy}(x, y, z) & \text{and} \quad u''_{zz} &= f''_{zz}(x, y, z) \end{aligned}$$

The differentiation of elementary functions represented explicitly 
is performed in accordance with the rules given in Sec. IV. 5. The 
differentiation of implicit functions is achieved by the repeated 
differentiation of equalities of type (14), (15) or (18) and the like. 
Derivatives of orders higher than the second are defined analogously. 

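Partial differentials of higher orders are defined in a similar manner
and, just as it was done in Sec. IV.12 and Sec. 10, we arrive at
the equalities

$$d^2_{xx} u = u''_{xx}\,dx^2, \qquad d^2_{xy} u = u''_{xy}\,dx\,dy \quad \text{etc.} \tag{20}$$

where the differential of the independent variable $x$ is understood
as $dx = \Delta x = d_x x$ and so on. From this we deduce

$$u''_{xx} = \frac{d^2_{xx} u}{(d_x x)^2} = \frac{\partial^2 u}{\partial x^2}, \qquad u''_{xy} = \frac{d^2_{xy} u}{(d_x x)(d_y y)} = \frac{\partial^2 u}{\partial x\,\partial y}, \qquad u''_{xz} = \frac{\partial^2 u}{\partial x\,\partial z} \quad \text{etc.}$$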

The notion of a partial difference of a function of several variables 
can be defined in a way similar to the one used in Sec. V.7. But in 
this case we must indicate the variable with respect to which the 
difference is taken. The differences can be taken with different steps 
for different variables. For example, let $z = f(x, y)$; then we can
designate the step along the x-axis by $h$ and use the symbol $\Delta_h$ for
denoting the partial difference with respect to $x$:
$\Delta_h z = f(x + h, y) - f(x, y)$. Similarly, let $k$ designate the step along the y-axis and
let $\Delta_k$ be the symbol for the partial difference corresponding to $y$:
$\Delta_k z = f(x, y + k) - f(x, y)$. Then it is natural to introduce the
notation

$$\Delta^2_{hh} z = \Delta_h(\Delta_h z), \qquad \Delta^2_{hk} z = \Delta_k(\Delta_h z) \quad \text{etc.}$$

The connection between partial difference quotients and the
derivatives is expressed by the formulas

$$z'_x = \lim_{h \to 0} \frac{\Delta_h z}{h}, \qquad z'_y = \lim_{k \to 0} \frac{\Delta_k z}{k}, \qquad z''_{xx} = \lim_{h \to 0} \frac{\Delta^2_{hh} z}{h^2}, \qquad z''_{xy} = \lim_{\substack{h \to 0 \\ k \to 0}} \frac{\Delta^2_{hk} z}{hk}$$

and the like.

15. Equality of Mixed Derivatives. Let $z = f(x, y)$. Then the
function has four partial derivatives of the second order, namely
$z''_{xx}$, $z''_{xy}$, $z''_{yx}$ and $z''_{yy}$. It turns out that the derivatives $z''_{xy}$ and $z''_{yx}$
which are called mixed partial derivatives are equal:

$$z''_{xy} = z''_{yx} \tag{21}$$

that is the mixed derivatives are independent of the order in which the
differentiation is performed. (This is true in case $z''_{xy}$ and $z''_{yx}$ are
continuous.)

To prove formula (21) it is sufficient to observe that, according
to the end of Sec. 14,

$$z''_{xy} = \lim_{h, k \to 0} \frac{1}{hk}\,\Delta^2_{hk} z, \qquad z''_{yx} = \lim_{h, k \to 0} \frac{1}{hk}\,\Delta^2_{kh} z \tag{22}$$

At the same time

$$\begin{aligned} \Delta^2_{hk} z = \Delta_k(\Delta_h z) &= \Delta_k[f(x + h, y) - f(x, y)] \\ &= [f(x + h, y + k) - f(x, y + k)] - [f(x + h, y) - f(x, y)] \\ &= f(x + h, y + k) - f(x, y + k) - f(x + h, y) + f(x, y), \\ \Delta^2_{kh} z = \Delta_h(\Delta_k z) &= \Delta_h[f(x, y + k) - f(x, y)] \\ &= [f(x + h, y + k) - f(x + h, y)] - [f(x, y + k) - f(x, y)] \\ &= f(x + h, y + k) - f(x + h, y) - f(x, y + k) + f(x, y) \end{aligned}$$

i.e.

$$\Delta^2_{hk} z = \Delta^2_{kh} z$$

and hence the mixed differences are independent of the order in
which they are calculated. From this and from (22) it follows now
that formula (21) is true.
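
The argument can be replayed numerically. In the sketch below the
function f, the point and the steps h and k are assumptions chosen for
the illustration; the two mixed differences are formed in both orders
and their common quotient by hk is compared with the mixed derivative
computed by hand.

```python
import math


def f(x, y):
    # Sample function (assumed); its mixed derivative is exp(x)*cos(y) + 2*y.
    return math.exp(x) * math.sin(y) + x * y ** 2


def diff_x(g, h):
    """Partial difference with step h along the x-axis, as a new function."""
    return lambda x, y: g(x + h, y) - g(x, y)


def diff_y(g, k):
    """Partial difference with step k along the y-axis, as a new function."""
    return lambda x, y: g(x, y + k) - g(x, y)


h, k = 1e-3, 2e-3
x0, y0 = 0.4, 1.1

d_hk = diff_y(diff_x(f, h), k)(x0, y0)   # first along x, then along y
d_kh = diff_x(diff_y(f, k), h)(x0, y0)   # first along y, then along x

mixed_quotient = d_hk / (h * k)
exact_mixed = math.exp(x0) * math.cos(y0) + 2 * y0
print(d_hk, d_kh)
print(mixed_quotient, exact_mixed)
```

The two differences agree up to rounding for any steps, whereas the
quotient approaches the derivative only as h and k decrease.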

If now we consider the derivatives of orders higher than the second
of a function it is permissible, in accord with formula (21), to
interchange any two consecutive operations of differentiation carried out
with respect to any variables. Thus, we can pass from any order
of performing the differentiation to any other order. Hence, only
the number of differentiations with respect to the corresponding
variables is essential here but not the order in which the operations
are performed. For instance,

$$u^{\mathrm{IV}}_{xxyz} = u^{\mathrm{IV}}_{xyxz} = u^{\mathrm{IV}}_{zyxx} = \ldots \quad \text{etc.}$$

but at the same time these derivatives are, generally, not equal to $u^{\mathrm{IV}}_{xyyz}$.

Partial differentials [see formula (20)] are also independent of the
order of differentiating a function.

16. Total Differentials of Higher Order. The total differential of 
any given order is defined as the total differential (see Sec. 11) of 
the total differential of the preceding order. As before (see Sec. IV. 12), 
the differentials of independent variables should be regarded as 
constant quantities in subsequent differentiations. For example, 
let $z = f(x, y)$. Then

$$\begin{aligned} dz &= z'_x\,dx + z'_y\,dy, \\ d^2 z &= d(dz) = (z'_x\,dx + z'_y\,dy)'_x\,dx + (z'_x\,dx + z'_y\,dy)'_y\,dy \\ &= z''_{xx}\,dx^2 + z''_{yx}\,dy\,dx + z''_{xy}\,dx\,dy + z''_{yy}\,dy^2 \\ &= z''_{xx}\,dx^2 + 2z''_{xy}\,dx\,dy + z''_{yy}\,dy^2 \end{aligned} \tag{23}$$

Here we have used formula (21). Further, we have 


$$\begin{aligned} d^3 z &= (z''_{xx}\,dx^2 + 2z''_{xy}\,dx\,dy + z''_{yy}\,dy^2)'_x\,dx + (z''_{xx}\,dx^2 + 2z''_{xy}\,dx\,dy + z''_{yy}\,dy^2)'_y\,dy \\ &= (z'''_{xxx}\,dx^3 + 2z'''_{xxy}\,dx^2\,dy + z'''_{xyy}\,dx\,dy^2) + (z'''_{xxy}\,dx^2\,dy + 2z'''_{xyy}\,dx\,dy^2 + z'''_{yyy}\,dy^3) \\ &= z'''_{xxx}\,dx^3 + 3z'''_{xxy}\,dx^2\,dy + 3z'''_{xyy}\,dx\,dy^2 + z'''_{yyy}\,dy^3 \end{aligned}$$

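We see that here, as well as in deducing formula (IV.32), the
calculations are performed according to the scheme in which we
successively remove the brackets in the expressions $(a + b)^2$, $(a + b)^3$ etc.
In the general case we have

$$d^n z = \frac{\partial^n z}{\partial x^n}\,dx^n + \binom{n}{1}\frac{\partial^n z}{\partial x^{n-1}\,\partial y}\,dx^{n-1}\,dy + \binom{n}{2}\frac{\partial^n z}{\partial x^{n-2}\,\partial y^2}\,dx^{n-2}\,dy^2 + \ldots + \frac{\partial^n z}{\partial y^n}\,dy^n$$

This result can be written in the following symbolical form:

$$d^n z = \left(dx\,\frac{\partial}{\partial x} + dy\,\frac{\partial}{\partial y}\right)^n z$$

In the right-hand side of the last formula the brackets should be
removed as if $\partial$, $\partial x$, $\partial y$, $dx$ and $dy$ were ordinary algebraic factors.
In a similar way, if $u = f(x, y, z)$ then

$$d^n u = \left(dx\,\frac{\partial}{\partial x} + dy\,\frac{\partial}{\partial y} + dz\,\frac{\partial}{\partial z}\right)^n u$$

and so on.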

If $z = f(x, y)$ but $x$ and $y$ are no longer independent variables
we must change formula (23) in a manner similar to the one
used in the end of Sec. IV.12:

$$\begin{aligned} d^2 z &= d(z'_x\,dx + z'_y\,dy) = d(z'_x\,dx) + d(z'_y\,dy) \\ &= d(z'_x)\,dx + z'_x\,d(dx) + d(z'_y)\,dy + z'_y\,d(dy) \\ &= (z''_{xx}\,dx + z''_{xy}\,dy)\,dx + z'_x\,d^2 x + (z''_{yx}\,dx + z''_{yy}\,dy)\,dy + z'_y\,d^2 y \\ &= z''_{xx}\,dx^2 + 2z''_{xy}\,dx\,dy + z''_{yy}\,dy^2 + z'_x\,d^2 x + z'_y\,d^2 y \end{aligned} \tag{24}$$

The expressions for subsequent differentials are changed in a
similar way.


CHAPTER X 


Solid Analytic Geometry 


§ 1. Space Coordinates 

1. Coordinate Systems in Space. Besides Cartesian coordinates 
described in Sec. VII. 9, the following coordinate systems are widely 
used. 

1. Cylindrical coordinates $\rho$, $\varphi$ and $z$ are shown in Fig. 201. To
construct a cylindrical coordinate system we choose polar
coordinates (in the x, y-plane) and add the third coordinate $z$ to them.

Obviously, for all the points in space to be described, it is sufficient
that we take the following ranges for $\rho$, $\varphi$ and $z$: $0 \le \rho < \infty$,
$-\pi < \varphi \le \pi$ and $-\infty < z < \infty$.

The coordinate surfaces, that is surfaces on which one of the
coordinates is constant whereas the other two vary, form three families
of surfaces, namely $\rho = \text{const}$, $\varphi = \text{const}$ and $z = \text{const}$. These
surfaces are depicted in Fig. 202. All these surfaces are of course
regarded as being extended to infinity. The coordinate curves, that
is curves on which two coordinates are constant whereas one of the
coordinates varies, constitute three families of curves which are
formed by the intersection of the coordinate surfaces. The coordinate
curves are shown in heavy lines in Fig. 202. (By the way, the
coordinate surfaces of a Cartesian coordinate system are the planes
parallel to the planes xOy, yOz or zOx, and the coordinate curves
are the straight lines parallel to the coordinate axes Ox, Oy or Oz.)

Let us take the Cartesian coordinates $(x, y, z)$ and the
cylindrical coordinates $(\rho, \varphi, z)$ which are placed as it is shown in Fig. 203.
Then the relationship between the coordinates is expressed by the
formulas $x = \rho \cos\varphi$, $y = \rho \sin\varphi$ and $z = z$.

Cylindrical coordinates are often applied to investigating solids 
and surfaces of revolution (such as a circular cylinder or a circular 
cone etc.), and the z-axis is placed along the axis of revolution in 
such investigations. 

2. Spherical coordinates (which are sometimes called spatial polar
coordinates) are shown in Fig. 204. These coordinates are analogous
to the geographical coordinates. The distinction between them
is that the “latitude” $\theta$ is reckoned here from the “North Pole” whereas
in geography it is reckoned from the equator. To describe all the
points in space it is sufficient to take $r$, $\theta$ and $\varphi$ within the limits
$0 \le r < \infty$, $0 \le \theta \le \pi$ and $-\pi < \varphi \le \pi$.

The coordinate surfaces and curves are shown in Fig. 205. If
Cartesian coordinates $(x, y, z)$ and spherical coordinates $(r, \theta, \varphi)$
are located as it is shown in Fig. 206 then the relationship between
them is expressed by the formulas

$$x = \rho \cos\varphi = r \sin\theta \cos\varphi, \qquad y = \rho \sin\varphi = r \sin\theta \sin\varphi, \qquad z = r \cos\theta$$

Spherical coordinates are especially convenient for investigating 
solids bounded by surfaces of the type shown in Fig. 205 but they 
are also applied in many other cases. 
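
The transition formulas are readily turned into a few lines of code.
The sketch below is an illustration only: the function names are
assumptions, angles are taken in radians, and the inverse
transformation relies on `math.atan2`, whose result lies between $-\pi$
and $\pi$ as required for $\varphi$.

```python
import math


def cylindrical_to_cartesian(rho, phi, z):
    """x = rho cos(phi), y = rho sin(phi), z = z."""
    return rho * math.cos(phi), rho * math.sin(phi), z


def spherical_to_cartesian(r, theta, phi):
    """theta is reckoned from the positive z-axis (the 'North Pole')."""
    rho = r * math.sin(theta)
    return rho * math.cos(phi), rho * math.sin(phi), r * math.cos(theta)


def cartesian_to_spherical(x, y, z):
    """Inverse transformation with 0 <= theta <= pi and phi from atan2."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # At the origin the angles are undetermined; zeros are an arbitrary choice.
        return 0.0, 0.0, 0.0
    cos_theta = max(-1.0, min(1.0, z / r))   # guard against rounding
    return r, math.acos(cos_theta), math.atan2(y, x)


point = spherical_to_cartesian(2.0, math.pi / 3, -math.pi / 4)
back = cartesian_to_spherical(*point)
print(point)
print(back)
```

Converting to Cartesian coordinates and back returns the original
triple for every point off the z-axis; on the axis itself $\varphi$ is
not determined by the point.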
Cartesian, cylindrical and spherical coordinates are particular
cases of the so-called orthogonal coordinates in which any two inter- 
secting coordinate curves form a right angle (check up this property 
for the above coordinate systems!). Some non-orthogonal coordinate 
systems are also applied to certain problems (for instance, general 
affine coordinates mentioned in Sec. VII.9). 

2. Degrees of Freedom. We have seen that different coordinate
systems can be introduced in space. A general property common to
all the systems is that the position of a point in space is specified
by three coordinates whereas the position of a point in a plane is
specified by two coordinates and a point on a curve by one coordinate.
This property can be expressed in the following terms: there are
three degrees of freedom when we choose a point in space (the
geometric space) or when a point moves in space whereas there are two
degrees of freedom when we choose a point belonging to a plane (and
also to an arbitrary surface) and one degree of freedom when we
consider a point on a curve. Or, in other words, the geometric space is
three-dimensional whereas surfaces are two-dimensional and
curves are one-dimensional.

Fig. 205    Fig. 206

In the general case the notion of a degree of freedom is introduced 
in the following way. Let there be a certain set of objects (in the 
above example such a set was the totality of all the points in space). 
Suppose that each of the objects can be specified by indicating nume- 
rical values of some parameters (in the above example such para- 
meters were the coordinates of a point). Let these parameters satisfy 
the following requirements. 
(1) The parameters are independent, that is they can assume
arbitrary values. For instance, if we fix all the parameters but one 
then this single parameter can be varied arbitrarily (sometimes 
within certain limits). 

(2) The parameters are essential, that is any variation of the 
parameters leads in fact to certain changes of the object in question. 

If these conditions are satisfied and if there are k such parameters 
then we say that we have k degrees of freedom in choosing an object
from this set. The set of objects in question is then called a k-di- 
mensional space (generalized space) or a k-dimensional manifold. The 
parameters are called coordinates (generalized coordinates) in the 
space. As in the case of ordinary coordinates, generalized coordinates 
can be chosen in different ways, and a specific choice of coordinates 
is usually made so that it should be convenient for the investigation. 
The objects which constitute a space are called its elements or points. 
Hence, a many-dimensional space acquires a concrete interpretation. 

The above definition of a dimension is in agreement with the 
definition of the dimension of a linear space given in Sec. VII. 19 
because in such a space the coefficients of the resolution of a vector 
with respect to a fixed basis can be taken as parameters. But now
we consider spaces which belong to a more general class, and the only 
connection between the elements of such a generalized space is 
that these objects are taken from a certain set. We can introduce 
the notion of closeness of the elements if we consider any elements 
whose parameters are close to each other as being close in the space 
in question. If such a notion is introduced in a space we can easily 
define the notion of a limit in this space. Such a space in which the 
notion of closeness of its elements is introduced (or, which is the
same, the notion of passage to a limit is defined) is called a topological 
space. 

Let us consider some examples. Let the set of all the circles lying 
in a plane be considered. Each of the circles is completely determined 
by the numerical values of three parameters, namely by the coor- 
dinates (x, y) of its centre and by its radius r. These parameters are 
independent (each of them can be varied arbitrarily) and essential
(every variation of one of the parameters leads to a certain variation 
of the circle in question). Therefore, when we choose a circle in 
a plane we have three degrees of freedom, and hence such a set of 
circles is a three-dimensional generalized space with the coordinates 
x, y and r. Similarly, the set of all spheres in space is a four-dimen- 
sional space. 

In physics we usually consider the set of events. An event is 
completely characterized if we can answer the questions “where does
the event take place?” and “when does the event occur?”. We can answer
the first question by indicating the corresponding coordinates, for
instance, the Cartesian coordinates x, y and z. The second question
is answered if we indicate the corresponding moment of time t.
The space of events is therefore four-dimensional, and we can choose
the quantities x, y, z and t as generalized coordinates in the
space.

One more example: what is the number of degrees of freedom when
a line segment of given length l moves in space? Each segment of
this kind is completely specified by the coordinates $(x_1, y_1, z_1)$ and
$(x_2, y_2, z_2)$ of its end-points. These coordinates can be taken as
parameters which specify the position of the segment. These parameters
are obviously essential but not independent since they are connected
by the equation

$$(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 = l^2$$

implied by formula (VII. 14). Hence, only five parameters can be
regarded as being independent. If we arbitrarily choose five of the
parameters then the sixth parameter is expressed in terms of the
five parameters with the help of the above relation. Thus, a line
segment of given length has five degrees of freedom when it moves
in space.
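The dependence of the sixth parameter on the other five can be illustrated by a short computation. The following Python sketch is not taken from the original text: the function name `sixth_coordinate` and the choice of $z_2$ as the dependent parameter are assumptions made only for illustration. It solves the above relation for $z_2$ when the other five coordinates and the length l are given. The relation generally yields two values of $z_2$, so the sixth parameter is determined up to this choice.

```python
import math


def sixth_coordinate(x1, y1, z1, x2, y2, length):
    """Solve (x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2 = length^2 for z2.

    Returns a tuple with the two admissible values of z2, or an empty
    tuple when the five given values are incompatible with the length.
    """
    rest = length**2 - (x2 - x1)**2 - (y2 - y1)**2
    if rest < 0:
        return ()
    root = math.sqrt(rest)
    return (z1 - root, z1 + root)


# Hypothetical data: one end-point at the origin, length 5.
print(sixth_coordinate(0, 0, 0, 3, 0, 5))
print(sixth_coordinate(0, 0, 0, 6, 0, 5))
```

The second call returns an empty tuple because the five chosen values cannot belong to a segment of length 5; the five parameters are independent only within the range allowed by the relation.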

Generally, if we have n parameters which are essential but are
connected by m independent equations (that is by equations such
that none of them is implied by the others) then we can choose
n − m parameters as independent parameters, and the remaining
m parameters will be expressed in terms of the former. Hence, there
will be n − m degrees of freedom. For instance, when a triangle
moves in space we have 9 − 3 = 6 degrees of freedom (check this
result!). This example is important in connection with the fact
that the position of a rigid body (“perfectly rigid body”) is completely
defined if we indicate the positions of its three points which do not
lie on the same straight line (why?). Consequently, we have six
degrees of freedom when we investigate the motion of a rigid body
in space.

In addition, let us find the number of degrees of freedom when an 
infinite straight line moves in space. We can reason in the following 
way: if we choose two arbitrary points A and B in space (each of 
the points is defined by its three coordinates) and draw a straight 
line (P) passing through the points we shall have six parameters 
specifying the position of the line. Since these parameters are obviously
independent one may think that there are six degrees of freedom
here. But such a conclusion is wrong because there are cases
here when a certain variation of the parameters does not change the
position of the straight line although it makes the points A and
B move (along the line (P) when it occupies a fixed position). Hence,
the condition that the parameters should be essential does not hold
here. When the point A slides along the straight line (P) it has one
three unknowns of the form

$$F_1(x, y, z) = 0, \qquad F_2(x, y, z) = 0, \qquad F_3(x, y, z) = 0.$$
point B slides alo
' ns of the points 
the above calc 
which must 
f degrees of : 
^se the coo 

The notions of algebraic and transcendental surfaces are introduced
as it was done in Sec. II. 7 for plane curves. As in [...] there
can also be singular cases here: imaginary surfaces, [...]
a surface and disintegration of a surface. It should be taken into
account that in Sec. II. 8 we considered cases when a [...]
degenerate into a point, but a surface can degenerate not only into
a point but also into a curve. For instance, a “sphere of zero radius”
is nothing but a point whereas an infinite “circular cylinder of zero
diameter” is a straight line etc.

4. Cylinders, Cones and Surfaces of Revolution. For example,
let us take the equation $z - x^2 = 0$. If we consider the corresponding
geometric figure as a plane curve then the equation represents
a parabola (L) with the equation $z = x^2$ lying in the x, z-plane. We
see that the point O with the coordinates x = 0 and z = 0 and the point
A with the coordinates x = 2 and z = 4 belong to the parabola.

But we can consider the same equation with respect to the
spatial coordinate system x, y, z and then we obtain a cylindrical
surface depicted in Fig. 207 for which the relation $z - x^2 = 0$ is
its equation. The parabola serves as the directing curve of the cylinder,
and its elements are parallel to the y-axis. This is a parabolic
cylinder. Indeed, besides the point A (2, 0, 4), the points ([...])
and (2, −8, 4) also belong to the surface. Moreover, all the points
with the coordinates (2, y, 4) belong to the surface for an arbitrary
y since these coordinates satisfy the equation under consideration
because the equation does not involve y and therefore y can be arbitrary
and it is only x and z that should satisfy the equation. [...]
revolution in question. For example, the equation $z = ay^2$ is the
equation of a parabola lying in the plane yOz. Therefore the equation
$z = a(x^2 + y^2)$ is the equation of the corresponding paraboloid
of revolution.

5. Curves in Space. A curve in space can be regarded as a line
of intersection of two surfaces (that is as the locus of points common
to both surfaces) or as a trace (trajectory) of a moving point. In the
first case the equations of both surfaces can be put down in the form

$$F_1(x, y, z) = 0, \qquad F_2(x, y, z) = 0 \qquad (2)$$

Since the points of the line of intersection of the surfaces belong 
simultaneously to both surfaces this line is the locus of points whose 
coordinates simultaneously satisfy both equations (2). Hence, 
system (2) should be regarded as a system of two equations in three 
unknowns. 

From the point of view of the second approach the equation of 
a curve has a parametric form 

$$x = \varphi(t), \qquad y = \psi(t), \qquad z = \chi(t) \qquad (3)$$

(see Sec. VII. 23). To pass from form (3) to form (2) we must eliminate 
t from equations (3) (for instance, we can express t in terms of x 
from the first equation and then substitute the result into the other 
two equations) if it is required and if it is possible. To perform the 
reverse transition from (2) to (3) (on the same conditions) we can 
denote one of the variables by t and then solve equations (2) for 
the other two variables. For instance, we can substitute x = t
and then solve the equations for y and z expressing these variables 
as functions of t. 
We sometimes encounter the problem of finding the projection
(L') of a given curve (L) lying in space on one of the coordinate planes.
As an example, see Fig. 210 where the projection on the plane xOy
is shown. To solve the problem we must determine the relationship
between the coordinates x and y of the points belonging to (L). If
(L) is represented by equations (2) then in order to find (L') we must
eliminate z from the equations. If (L) is represented parametrically
in form (3) then we can simply retain the first two equalities.

6. Parametric Representation of Surfaces in Space. Parametric
Representation of Functions of Several Variables. It was shown in
Sec. 5 that to obtain a parametric representation of a curve we must
introduce one parameter [see formula (3)]. Now let us find out the
geometric meaning of equations of the form

$$x = \varphi(u, v), \qquad y = \psi(u, v), \qquad z = \chi(u, v) \qquad (4)$$

containing two independent parameters u and v which can take
on arbitrary numerical values. It is natural to expect that such
equations represent a surface in space since a point on a surface
has two degrees of freedom and therefore in order to specify such
a point two parameters should be indicated.

To justify this supposition let us take any two equations (4),
for instance, the first two equations. Generally speaking, we can
express u and v from these equations (at least theoretically) in terms
of x and y. Thus we obtain expressions of the form u = u (x, y)
and v = v (x, y). If we then substitute them into the third equation
(4) we shall arrive at an equation of the form z = z (x, y) which,
as it was shown in Sec. IX. 1, represents a surface in space. Consequently,
equations (4) represent a surface in space in parametric
form.

We shall show in Sec. XI. 14 that there are some singular cases
when the above transition from (4) to the formula z = z (x, y) is
impossible. These are the cases when the surface in question degenerates
into a curve or into a point. For instance, the “surface”
x = u + v, y = 2u + 2v, z = 1 − u − v is in fact a line (why is
it so?).
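The degeneration in the last example can be checked numerically. The Python sketch below is an illustration only and is not part of the original text; the helper names `point`, `sub` and `cross` are hypothetical. It tests, on a grid of integer parameter values, that every point of the “surface” lies on one straight line, using the fact that the vector product of two parallel vectors is zero.

```python
def point(u, v):
    """Point of the 'surface' x = u + v, y = 2u + 2v, z = 1 - u - v."""
    return (u + v, 2 * u + 2 * v, 1 - u - v)


def sub(p, q):
    """Difference of two points, regarded as a vector."""
    return tuple(pi - qi for pi, qi in zip(p, q))


def cross(a, b):
    """Vector product of two three-dimensional vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


base = point(0, 0)                  # a fixed point of the set
direction = sub(point(1, 0), base)  # a vector along the supposed line

# Every point minus the base point must be parallel to `direction`.
all_on_line = all(
    cross(sub(point(u, v), base), direction) == (0, 0, 0)
    for u in range(-3, 4)
    for v in range(-3, 4)
)
print(all_on_line)
```

The check succeeds because all three coordinates depend on u and v only through the sum u + v, so that there is in fact a single effective parameter.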

A sketch of a surface (S) represented by equations (4) is depicted
in Fig. 211. If we make u assume different constant values and if
we vary v then we obtain different curves lying on (S), each of these
curves being completely specified by the corresponding constant
value of u. This is so because when u is fixed and only v is varied
we have a single parameter, i.e. only one degree of freedom. Similarly,
putting v = const we obtain another family of curves on (S). These
curves can be taken as coordinate curves on (S), and the parameters
u and v can be regarded as coordinates on (S).

Equations (4) define a certain functional relationship between 
x, y and z. Actually, if the values of x and y are given we can (at 
least theoretically) determine the corresponding values of u and v, 
as it was done above. The values u and v thus found yield the corresponding
value of z. Hence, we obtain a function z = z (x, y)
which is originally represented in parametric form (4) and whose 
graph is the surface (S) considered above. 

In the general case of an arbitrary number of variables the para- 
metric representation of a functional relationship is introduced in 
the following way. Let there be given the equations 

$$x_1 = f_1(t_1, t_2, \dots, t_m), \quad x_2 = f_2(t_1, t_2, \dots, t_m), \quad \dots, \quad x_n = f_n(t_1, t_2, \dots, t_m) \qquad (5)$$

where the variables $t_1, t_2, \dots, t_m$ are considered to be parameters.
If m < n then choosing m equations we can express $t_1, t_2, \dots, t_m$
from these equations in terms of the corresponding variables x.
This is usually possible with the exception of some cases of degeneration
which will be discussed in Sec. XI. 14. Substituting the expressions
thus obtained into the remaining equations (5) we get a representation
of n − m variables of type x as functions of the other m variables
x. Hence, we can say that equations (5) represent an m-dimensional
manifold lying in the n-dimensional space. When we
choose a point belonging to such a manifold we have m degrees
of freedom. In those cases when there is a degeneration the dimension
of the manifold turns out to be less than m. If m ≥ n then,
generally speaking, equations (5) do not define any functional
relationship between the variables x.

The derivatives of functions represented parametrically are found
by analogy with Sec. IX. 13 where we differentiated implicit functions.
For example, let a function z = z (x, y) defined by formulas (4)
be considered and let it be necessary to compute the derivative
$z'_x$. Rewriting the first two equations (4) in the form $\varphi(u, v) - x = 0$,
$\psi(u, v) - y = 0$ we can find $u'_x$ and $v'_x$ as it was done for
equations (IX. 17) (by the way, in practical calculations this can
be performed without rewriting the equations). The condition
guaranteeing the possibility of such computations is

$$\begin{vmatrix} \varphi'_u & \varphi'_v \\ \psi'_u & \psi'_v \end{vmatrix} \neq 0$$

Differentiating the last equation (4) we obtain the formula
$z'_x = \chi'_u u'_x + \chi'_v v'_x$. Finally, substituting the values $u'_x$ and $v'_x$ found
above into the formula we obtain the desired expression of $z'_x$. Starting
from this result we can find the derivatives of higher order by
means of repeated differentiation.
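The procedure just described can be traced on a concrete example. The Python sketch below is only an illustration and is not part of the original text; the function name `z_x` and the particular surface x = u + v, y = u − v, z = uv are assumptions chosen so that the answer can be checked by hand. Differentiating the first two equations (4) with respect to x gives two linear equations for the derivatives of u and v, which are solved by determinants; the result is then substituted into the derivative of the third equation.

```python
def z_x(phi_u, phi_v, psi_u, psi_v, chi_u, chi_v):
    """Derivative z'_x of a function given parametrically by equations (4).

    The arguments are the values of the partial derivatives of phi, psi
    and chi with respect to u and v at the point in question.
    """
    delta = phi_u * psi_v - phi_v * psi_u   # determinant in the condition
    if delta == 0:
        raise ValueError("degenerate case: the determinant is zero")
    # Solve  phi_u*u_x + phi_v*v_x = 1,  psi_u*u_x + psi_v*v_x = 0.
    u_x = psi_v / delta
    v_x = -psi_u / delta
    return chi_u * u_x + chi_v * v_x


# Hypothetical example: x = u + v, y = u - v, z = u*v.
u, v = 3.0, 1.0
value = z_x(1, 1, 1, -1, chi_u=v, chi_v=u)
x = u + v
print(value, x / 2)
```

For this example u and v can be eliminated explicitly, which gives $z = (x^2 - y^2)/4$ and hence $z'_x = x/2$; the two printed numbers therefore coincide.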


§ 3. Algebraic Surfaces of the First and of
the Second Orders

7. Algebraic Surfaces of the First Order. The general form of an 
equation of a surface of the first order is put down as 

Ax + By + Cz + D = 0 (6) 

(compare with Sec. II. 9). To find out what surface is defined by such
an equation let us introduce the vector

$$\mathbf{a} = A\mathbf{i} + B\mathbf{j} + C\mathbf{k} \qquad (7)$$

Then equation (6) can be rewritten in the form $\mathbf{a} \cdot \mathbf{r} + D = 0$ [see
formulas (VII. 7) and (VII. 12)] where r is the radius-vector. But
$\mathbf{a} \cdot \mathbf{r} = a \, \mathrm{proj}_{\mathbf{a}} \mathbf{r}$, where $a = |\mathbf{a}|$ [see formula (VII. 4)], which implies

$$a \, \mathrm{proj}_{\mathbf{a}} \mathbf{r} + D = 0, \quad \text{i.e.} \quad \mathrm{proj}_{\mathbf{a}} \mathbf{r} = -\frac{D}{a}$$

Thus, the surface is the locus of all points M for which the projections
of their radius-vectors on the constant vector a have the constant
value $-D/a$. Fig. 212 shows that this is a plane (P) which is
perpendicular to the vector a.

Hence, surfaces of the first order are planes. 
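The constancy of the projection can be verified on a numerical example. The Python sketch below is an illustration only and is not part of the original text; the coefficients and the sample points are hypothetical and are chosen so that the arithmetic is exact. For several points satisfying equation (6) it computes the projection of the radius-vector on the vector a and compares it with the value −D/|a|.

```python
import math


def projection_on_a(a, r):
    """Projection of the radius-vector r on the direction of the vector a."""
    dot = sum(ai * ri for ai, ri in zip(a, r))
    return dot / math.sqrt(sum(ai * ai for ai in a))


# Hypothetical plane x + 2y + 2z - 6 = 0, so that |a| = 3 and -D/|a| = 2.
A, B, C, D = 1, 2, 2, -6
a = (A, B, C)
points = [(6, 0, 0), (0, 3, 0), (0, 0, 3), (2, 1, 1)]

for r in points:
    on_plane = A * r[0] + B * r[1] + C * r[2] + D == 0
    print(r, on_plane, projection_on_a(a, r))
```

All four points lie on the plane, and each of them yields one and the same projection.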

Let us consider some simple problems. 

(1) Investigate in what way variations of the values of the coefficients
A, B, C and D affect the position of the plane (P). This can
be seen from Fig. 212. For instance, if we vary D retaining some
constant values of A, B and C the plane will be in a translatory
motion. In particular, for D = 0 it passes through the origin of
coordinates. Variations of the coefficients A, B and C result in
rotating the vector a and, consequently, in rotating the plane (P).
If A = 0 then the vector a lies in the plane yOz, and the plane (P)
is therefore parallel to the x-axis. If, in addition, D = 0 the plane
will pass through the x-axis.

The cases when other coefficients turn into zero are investigated
similarly.

Let us also put down the equations of the coordinate planes:
z = 0 is the equation of the plane xOy, x = 0 is the equation of
the plane yOz and y = 0 is the equation of the plane zOx.
in three unknowns. Each of the equations is the equation of a plane
in the x, y, z-space, and hence the problem reduces to finding the
point of intersection of the corresponding planes $(P_1)$, $(P_2)$ and
$(P_3)$. The determinant D of the system is equal to the triple scalar
product of the three corresponding vectors which are perpendicular
to the planes (see Sec. VII. 15). If D ≠ 0 these vectors are not parallel
to the same plane, and the plane $(P_3)$ will therefore intersect
the line of intersection of the planes $(P_1)$ and $(P_2)$ at a single point.
Hence system (VI.5) has a unique solution. If D = 0 the vectors
are parallel to a plane T (see Fig. 214). This implies that the planes
$(P_1)$, $(P_2)$ and $(P_3)$ are parallel to a straight line (l), and therefore
they either have no points in common at all or have infinitely many
common points which constitute a straight line parallel to (l).
In the first case the system has no solutions and in the second it
has infinitely many solutions (the whole “straight line of solutions”).
Possible dispositions of the planes are shown in Fig. 215 for both
cases. (Let the reader think what other dispositions can be found
here.)
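The criterion based on the determinant can be put into a few lines of code. The Python sketch below is an illustration under stated assumptions and is not part of the original text: the function names `det3` and `has_unique_intersection` are hypothetical, and each plane is represented only by its perpendicular vector (A, B, C). The determinant formed by these three vectors is their triple scalar product.

```python
def det3(m):
    """Determinant of a 3 x 3 array given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def has_unique_intersection(normals):
    """True if three planes with the given perpendicular vectors
    meet at exactly one point, i.e. if the determinant is non-zero."""
    return det3(normals) != 0


# The coordinate planes: the perpendicular vectors are not parallel
# to one plane, and the only common point is the origin.
print(has_unique_intersection([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))

# Three vectors lying in the plane xOy: the planes are all parallel
# to the z-axis, so there is either no common point or a whole line.
print(has_unique_intersection([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))
```

When the determinant vanishes the sketch does not distinguish between the two remaining cases; for that the free terms of the equations would also have to be taken into account.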

8. Ellipsoid. We shall begin with the canonical equation of an
ellipsoid without giving its geometric definition. The equation is
of the form

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \qquad (10)$$

where a, b and c are positive constants called semi-axes of the ellipsoid.

By analogy with Sec. II. 10, we can easily verify that |x| ≤ a,
|y| ≤ b and |z| ≤ c, that is an ellipsoid is a finite, bounded
surface, that the planes xOy, yOz and zOx are its planes of symmetry
and that the origin of coordinates is its centre of symmetry (the
centre of the ellipsoid).

To investigate the form of an ellipsoid let us apply the method
of parallel sections. The method consists in investigating the curves
of intersection of the surface in question with planes parallel to the
coordinate planes, that is with the planes whose equations are of the
form x = const, y = const and z = const. Let us first consider the curve
of intersection of our ellipsoid with the plane z = h which is parallel to
the plane xOy. For this purpose let us put z = h in equation (10)
which results in

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 - \frac{h^2}{c^2}$$

or

$$\frac{x^2}{\left(a\sqrt{1 - h^2/c^2}\right)^2} + \frac{y^2}{\left(b\sqrt{1 - h^2/c^2}\right)^2} = 1$$

Hence, we obtain an ellipse with semi-axes $a\sqrt{1 - h^2/c^2}$ and
$b\sqrt{1 - h^2/c^2}$. Thus, for h = 0 we have an ellipse with semi-axes a
and b. When |h| is increased the ellipse decreases but remains
similar to the original ellipse because the ratio of its semi-axes
is constant. When h assumes the values h = ±c the ellipse degenerates
into a point since its semi-axes become equal to zero. The
investigation of the curves of intersection of the ellipsoid with
the planes y = h and x = h yields similar results. Hence we conclude
that the ellipsoid has the form shown in Fig. 216.
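The behaviour of the sections can be tabulated numerically. The Python sketch below is an illustration only and is not part of the original text; the function name `section_semi_axes` and the numerical values of a, b and c are hypothetical. It computes the semi-axes of the section of ellipsoid (10) by the plane z = h; for |h| > c the plane does not meet the ellipsoid, since |z| ≤ c on it, and the function returns None.

```python
import math


def section_semi_axes(a, b, c, h):
    """Semi-axes of the section of the ellipsoid (10) by the plane z = h.

    Returns None when |h| > c, i.e. when there is no intersection.
    """
    if abs(h) > c:
        return None
    k = math.sqrt(1 - h**2 / c**2)
    return (a * k, b * k)


a, b, c = 3.0, 2.0, 5.0   # hypothetical semi-axes
for h in (0.0, 3.0, 4.0, 5.0, 6.0):
    print(h, section_semi_axes(a, b, c, h))
```

In every row where a section exists and is not a point, the ratio of the two semi-axes equals a/b, which is the similarity of the sections mentioned in the text.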

Fig. 216

If two semi-axes are equal, for instance, if a = b then the curves
of intersection with the planes z = h are circles. Hence, in this
case we obtain an ellipsoid of revolution, that is a surface generated
by the revolution of an ellipse about one of its axes, instead of a
triaxial ellipsoid which we have in the general case when the axes
are unequal. An ellipsoid of revolution is also referred to as a spheroid.
When a spheroid is generated by revolving an ellipse about
its major axis it is called a prolate ellipsoid of revolution (it resembles
an egg). If an ellipse rotates about its minor axis we have an
oblate ellipsoid of revolution. Finally, if all the three axes are equal
to each other the ellipsoid turns into a sphere.

By analogy with Sec. II. 10, we can easily prove that a triaxial 
ellipsoid can be obtained by performing uniform contraction (or
stretching) of a sphere towards two coordinate planes. The con- 
traction towards one of the coordinate planes necessarily results 
in a spheroid. At the same time, Sec. XI. 6 implies that when per- 
forming uniform contraction of an ellipsoid we again obtain an 
ellipsoid. 


9. Hyperboloids. There are two types of hyperboloid. A hyperboloid
of one sheet is represented by the canonical equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \qquad (11)$$

The section of the surface by the plane z = h yields an ellipse whose
semi-axes are $a\sqrt{1 + h^2/c^2}$ and $b\sqrt{1 + h^2/c^2}$ (check it!). Hence,
for h = 0 we have an ellipse with semi-axes a and b. When |h|
increases the sizes of the ellipse also increase and tend to infinity
as |h| → ∞. All the ellipses are similar because the ratio of their
semi-axes equals a/b = const. We similarly verify that the sections
by the planes y = h and x = h are hyperbolas. Hence we obtain
a surface which is depicted in Fig. 217. A hyperboloid of one sheet,
like an ellipsoid, has three planes of symmetry and a centre of
symmetry.

If a = b we have a surface which is generated by revolving a 
hyperbola about its conjugate axis, that is a hyperboloid of revo- 
lution (of one sheet). 

There is another way of interpreting a hyperboloid of one sheet.
Let us first take the case when the hyperboloid of one sheet having
equation (11) is a surface of revolution, that is when a = b. [Equation
(11) turns into

$$\frac{x^2}{a^2} + \frac{y^2}{a^2} - \frac{z^2}{c^2} = 1 \qquad (a = b)$$

for this case.] Take the plane y = b = a. Let us consider the section
of the hyperboloid in question by this plane. For this purpose we
substitute y = b = a into equation (11) which results in

$$\frac{x^2}{a^2} - \frac{z^2}{c^2} = 0 \quad \text{or} \quad \left(\frac{x}{a} - \frac{z}{c}\right)\left(\frac{x}{a} + \frac{z}{c}\right) = 0$$

i.e.

$$\frac{x}{a} - \frac{z}{c} = 0 \quad \text{and} \quad \frac{x}{a} + \frac{z}{c} = 0 \qquad (y = b)$$

Hence, the curve of intersection disintegrates into a pair of straight
lines x/a − z/c = 0, y = b and x/a + z/c = 0, y = b which intersect
at the point A (0, b, 0) (the lines are shown in Fig. 218). Because
of the axial symmetry we have the same form of section by
any plane which is parallel to the z-axis and which touches the
“gorge” circle (the ellipse corresponding to the section z = h = 0
is called the gorge ellipse; in the case a = b we thus have the gorge
circle). Consequently, the whole hyperboloid is entirely made up
of these straight lines forming two families of straight lines, as it
is shown in Fig. 218, because through each of its points there pass
two straight lines which lie entirely on the surface of the hyperboloid.
Fig. 218

This property is analogous to the properties of a cylinder or of a
cone which are also made up of straight lines (their elements), but
in the latter cases the straight lines belong to a single family of
lines. Incidentally, we see that a hyperboloid of revolution (of one
sheet) can be generated by revolving one of the two skew lines about
the other. Now, passing to the general case of a hyperboloid of one
sheet when the parameters a, b and c entering into (11) can take on
arbitrary positive numerical values we see that such a hyperboloid
can be obtained from a hyperboloid of revolution if we uniformly
contract the latter. But it is obvious that straight lines pass into
straight lines under such a deformation, and therefore we conclude
that a hyperboloid of one sheet of the general form is also made up
of straight lines forming two families of straight lines. In conclusion,
let us note that the plane depicted in Fig. 218 is the tangent plane
to the hyperboloid at the point A. In fact, the tangent plane passing
through a point A belonging to an arbitrary surface (S) is, by definition,
a plane which touches any curve lying on (S) and passing
through the point A. Therefore the tangent plane to our hyperboloid
must pass through both straight lines which entirely lie on the
hyperboloid and intersect at the point A. Hence, we see that a
tangent plane to a surface may intersect the surface along two distinct
lines.
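That the two straight lines through the point A (0, b, 0) lie entirely on the hyperboloid can be confirmed by direct substitution. The Python sketch below is an illustration only and is not part of the original text; the function name `on_hyperboloid`, the parametrization of the lines by an integer t and the numerical values of a, b and c are hypothetical. Exact rational arithmetic is used so that the equality in equation (11) can be tested without rounding errors.

```python
from fractions import Fraction


def on_hyperboloid(p, a, b, c):
    """True if the point p = (x, y, z) satisfies equation (11) exactly."""
    x, y, z = p
    return (Fraction(x * x, a * a) + Fraction(y * y, b * b)
            - Fraction(z * z, c * c)) == 1


a, b, c = 2, 2, 3   # hypothetical semi-axes of a hyperboloid of revolution

# Points of the lines x/a - z/c = 0, y = b and x/a + z/c = 0, y = b.
line_1 = [(a * t, b, c * t) for t in range(-5, 6)]
line_2 = [(a * t, b, -c * t) for t in range(-5, 6)]

print(all(on_hyperboloid(p, a, b, c) for p in line_1))
print(all(on_hyperboloid(p, a, b, c) for p in line_2))
```

The same substitution works when a and b are unequal, in agreement with the remark that the straight lines are preserved under a uniform contraction.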

The canonical equation of a hyperboloid of two sheets has the form

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1$$

The section by the plane z = h is an ellipse with semi-axes
$a\sqrt{h^2/c^2 - 1}$ and $b\sqrt{h^2/c^2 - 1}$.
The plane z = h does not therefore intersect the surface for |h| < c.
For |h| = c, that is for h = ±c, we obtain ellipses whose
axes are equal to zero, that is single points. When |h| increases from c to
infinity the sizes of the corresponding ellipses also tend to infinity. The ratio
of the semi-axes being equal to the constant a/b, all the ellipses are similar.
The intersection with the planes y = h and x = h yields hyperbolas. Hence we
obtain a surface which consists of two distinct portions (“sheets”) which extend
to infinity. The surface is depicted in Fig. 219. If a = b we obtain a hyperboloid
of revolution generated by revolving a hyperbola about its transverse axis.
Fig. 219

10. Paraboloids. There are also two types of paraboloid. An elliptic
paraboloid has the canonical equation

$$z = ax^2 + by^2 \qquad (a, b > 0)$$

The section by the plane z = h is a curve represented by the equation
$ax^2 + by^2 = h$ which can be put down as

$$\frac{x^2}{\left(\sqrt{h/a}\right)^2} + \frac{y^2}{\left(\sqrt{h/b}\right)^2} = 1$$

Hence, for h > 0, we obtain an ellipse with semi-axes $\sqrt{h/a}$ and $\sqrt{h/b}$.
There will be no intersection for h < 0, and we shall have a single
point (the origin of coordinates) for h = 0. When h increases from
0 to infinity the sizes of the ellipses tend to infinity, and all the
ellipses are similar because the ratio of their semi-axes assumes
the constant value $\sqrt{h/a} : \sqrt{h/b} = \sqrt{b/a}$. The sections by the
planes y = h and x = h are parabolas. Hence, we obtain a surface
which is shown in Fig. 220. The paraboloid has two planes of
symmetry (namely, the planes x = 0 and y = 0). If a = b we have
a paraboloid of revolution generated by revolving a parabola about
its axis.
A hyperbolic paraboloid has the canonical equation

$$z = -ax^2 + by^2 \qquad (a, b > 0) \qquad (12)$$

Intersecting the surface by the plane x = 0 we obtain the parabola
$z = by^2$ which opens upwards (in the positive direction of the z-axis).
On the contrary, the sections by the planes y = h are the
parabolas $z = -ax^2 + bh^2$ which open downwards. The sections
are shown in Fig. 221. Finally, the sections by the planes z = h
are hyperbolas. Thus, we obtain a surface having the form of a saddle.

It can be shown that this surface, like a hyperboloid of one sheet,
is entirely made up of straight lines forming two families. For
instance, the tangent plane to the hyperbolic paraboloid (shown
in Fig. 221) passing through the origin of coordinates is the plane
z = 0. But at the same time, if we put z = 0 in equation (12) we
get $\sqrt{a}\,x = \pm\sqrt{b}\,y$ which means that the tangent plane intersects
the surface along two straight lines.
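The two straight lines in the tangent plane z = 0 can be checked by substitution into equation (12). The Python sketch below is an illustration only and is not part of the original text; the function name `on_saddle` is hypothetical, and the coefficients are taken in the form $a = p^2$, $b = q^2$ with integer p and q (an assumption made solely to keep the arithmetic exact).

```python
def on_saddle(point, a, b):
    """True if the point (x, y, z) satisfies z = -a*x**2 + b*y**2."""
    x, y, z = point
    return z == -a * x * x + b * y * y


p, q = 2, 3          # hypothetical values, so that a = 4 and b = 9
a, b = p * p, q * q

# Points of the lines sqrt(a)*x = sqrt(b)*y and sqrt(a)*x = -sqrt(b)*y
# lying in the plane z = 0.
line_plus = [(q * t, p * t, 0) for t in range(-5, 6)]
line_minus = [(q * t, -p * t, 0) for t in range(-5, 6)]

print(all(on_saddle(pt, a, b) for pt in line_plus))
print(all(on_saddle(pt, a, b) for pt in line_minus))
```

Both checks succeed, so each of the two lines lies entirely on the surface, as stated above.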

11. General Review of Algebraic Surfaces of the Second Order.
The general form of an equation representing a surface of the second
order can be written as

$$Ax^2 + 2Bxy + Cy^2 + 2Dxz + 2Eyz + Fz^2 + Gx + Hy + Iz + J = 0 \qquad (13)$$

(compare with Sec. II. 13).

We shall show in Sec. XI. 12 that we can always perform a rotation
of the original Cartesian axes so that the equation related to
the new coordinates should no longer contain the terms involving
the products of different coordinates. Therefore if we denote the new
coordinates as x', y' and z' the equation will not contain the products
x'y', x'z' and y'z' and thus it will have the form

$$A'x'^2 + C'y'^2 + F'z'^2 + G'x' + H'y' + I'z' + J' = 0 \qquad (14)$$
where A', C', F', G', H', I' and J' are some new constant coefficients
(compare with Sec. II. 13). Depending on the signs of the
coefficients A', C' and F' we perform the further investigation in
different ways. Let us first suppose that all these coefficients are
different from zero and have the same sign. For definiteness, let
them be positive. Then, as it was done in Sec. II. 13, we complete
the squares and perform a parallel translation of the coordinate
axes which leads to a new equation of the form

$$A'x''^2 + C'y''^2 + F'z''^2 + J' = 0$$

i.e.

$$\frac{x''^2}{-J'/A'} + \frac{y''^2}{-J'/C'} + \frac{z''^2}{-J'/F'} = 1$$

where x'', y'' and z'' are the new coordinates appearing after the
parallel translation. It follows that if J' < 0 we get the canonical
equation of an ellipsoid, and therefore the original surface (13)
is also an ellipsoid whose planes of symmetry and centre of symmetry
are displaced and turned relative to the coordinate planes and the
origin of coordinates of the original coordinate system x, y, z.
But if J' > 0 or J' = 0 we obtain, respectively, an imaginary
surface or a single point. Similar results are obtained for the case when
the coefficients A', C' and F' are negative.

We suggest that the reader should verify that when the coefficients
A', C' and F' are different from zero but have unlike signs
the corresponding surface will be either a hyperboloid or a cone 
of the second order. We usually call such a cone elliptic although 
its sections, by different planes, can be ellipses, hyperbolas or 
parabolas (see Fig. 86). In particular, there can also be a circular 
cone. 

If exactly one of the coefficients A', C' and F' entering into equation
(14) is equal to zero, for instance, if F' = 0 whereas the corresponding
coefficient I' is different from zero then the surface will
be a paraboloid. It can be shown (but we are not going to prove it
here) that in all other cases there can be only a cylinder of the second
order, a pair of planes (which may coincide), degeneration into
a straight line and an imaginary surface. Besides, a cylinder of 
the second order can have a directing curve which is an ellipse 
(a circle in a particular case), a hyperbola or a parabola. According- 
ly, we can have an elliptic (or circular in a particular case), a hyper- 
bolic or a parabolic cylinder. For example, the cylinder described 
in the beginning of Sec. 4 is parabolic. 
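The case when all three coefficients of the squares are different from zero can be summarized as a small decision procedure. The Python sketch below is an illustration under stated assumptions and is not part of the original text: the function name `classify_central_quadric` is hypothetical, and the equation is assumed to have already been reduced by rotation and parallel translation to the form $Ax^2 + Cy^2 + Fz^2 + J = 0$ (primes are dropped for brevity). The distinction between hyperboloids of one and of two sheets is made by comparing with the canonical equations of Sec. 9, that is by counting the positive coefficients after the equation is divided by −J.

```python
def classify_central_quadric(A, C, F, J):
    """Classify the surface A*x^2 + C*y^2 + F*z^2 + J = 0.

    All three coefficients A, C and F are assumed to be non-zero.
    """
    coeffs = (A, C, F)
    if 0 in coeffs:
        raise ValueError("A, C and F must all be different from zero")
    same_sign = all(k > 0 for k in coeffs) or all(k < 0 for k in coeffs)
    if J == 0:
        return "single point" if same_sign else "cone of the second order"
    # After division by -J the right-hand side becomes 1; count the
    # squares that enter with a positive coefficient.
    positives = sum(1 for k in coeffs if -k / J > 0)
    return {
        3: "ellipsoid",
        2: "hyperboloid of one sheet",
        1: "hyperboloid of two sheets",
        0: "imaginary surface",
    }[positives]


print(classify_central_quadric(1, 1, 1, -1))
print(classify_central_quadric(1, 1, -1, -1))
print(classify_central_quadric(1, 1, -1, 1))
print(classify_central_quadric(1, 1, -1, 0))
```

The cases in which one or more of the coefficients of the squares vanish (paraboloids, cylinders, pairs of planes) are deliberately left outside the sketch.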
Matrices and Their
Applications 


§ 1. Matrices 

Matrices were first introduced by the Irish mathematician, 
W. Hamilton (1805-1865) and the English mathematician A. Cayley 
(1821-1895). They are widely used now in various branches of mathe- 
matics because their application considerably simplifies the inve- 
stigation of complicated systems of equations. 

1. Definitions. We begin with some formal definitions whose
advisability will be clarified later. A matrix is a rectangular array
composed of numbers or some other objects. Unless the contrary
is stated, we shall only deal with real number matrices, that is matrices
composed of real numbers. For instance, such a matrix can
have the form

$$\begin{pmatrix} 2 & -1.3 & 0 \\ 1 & \pi & 1 \end{pmatrix}, \quad \begin{pmatrix} 2 & 1 & 0 \\ -3 & \sqrt{2} & 0 \\ 2 & -1 & \frac{1}{2} \end{pmatrix}, \quad \begin{pmatrix} 1 \\ -2 \\ 0 \\ 3 \end{pmatrix} \quad \text{or} \quad (5) \quad \text{etc.} \qquad (1)$$

Here the parentheses are the sign of matrix. Double vertical bars
are also used for this purpose (that is the notation of the form

$$\begin{Vmatrix} 2 & 1 & 0 \\ -3 & \sqrt{2} & 0 \\ 2 & -1 & \frac{1}{2} \end{Vmatrix}$$

etc.) but not the simple vertical bars
which designate a determinant (see § VI. 1). As in the theory of
determinants, we consider elements, rows and columns of matrices.
But there is an important difference between a determinant and
a matrix: a determinant is equal to a certain number (see Sec. VI. 1)
whereas a matrix is regarded as an independent object which is not
reduced to a simpler object (such as a number and the like). For
brevity, we can designate a matrix by a single letter, for instance,
by A, B etc., but then the letter A will nevertheless designate the
whole array of numbers. A matrix can be put down in the general
form as

$$\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \dots & \dots & \dots & \dots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} \qquad (2)$$

It is convenient to equip the elements of a matrix with two indices
under the convention that the first index indicates the number of
the row and the second the number of the column in which the
element appears. We shall sometimes use the abridged notation
$A = (a_{ij})_{mn}$ which means that i varies from 1 to m and j from 1 to n.

Every matrix is characterized by its numbers of rows and columns.
A matrix having m rows and n columns will be referred to as an
(m × n) matrix. For instance, in formulas (1) and (2) we have,
respectively, (2 × 3), (3 × 3), (4 × 1), (1 × 1), and (m × n)
matrices. If the number of rows coincides with the number of columns
the matrix is said to be a square matrix. In this case the
number of its rows and columns is called the order of the matrix.
A square matrix of the first order is identified with its single element.
For instance, the fourth matrix in (1) is simply the number 5.

A matrix consisting of a single column is called a column matrix
or a number vector (column-vector). Such a matrix is identified
with a vector belonging to a Cartesian space of number n-tuples
(see Sec. VII.18). Thus, the third matrix in (1) is a vector of the
space $E_4$, the coordinates of the vector being 1, $-2$, 0, 3. A
matrix having only one row is called a row matrix.

A matrix all of whose elements are equal to zero is called a zero
matrix. A square matrix all of whose elements are equal to zero,
except possibly those forming its principal diagonal (that is the
diagonal connecting the upper left element with the lower right
element), is called a diagonal matrix. If the diagonal is formed
by the elements a, b, ..., k the diagonal matrix is denoted as
diag (a, b, ..., k). If all the elements of a diagonal matrix forming
its principal diagonal are equal to unity the matrix is referred to
as a unit matrix. Such a matrix is usually designated by the letter I.
For example, the unit matrix of the third order is of the form

$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \operatorname{diag}(1, 1, 1) \qquad (3)$$

The so-called operation of transposition consists in interchanging
rows and columns of a matrix with the same indices. Such an
operation was applied to determinants in Sec. VI.2. If we have a
matrix A then the transposed matrix (the transpose of A) will be
designated as A*. For instance,

$$\begin{pmatrix} 2 & 0 & 3 \\ 7 & -2 & 11 \end{pmatrix}^{*} = \begin{pmatrix} 2 & 7 \\ 0 & -2 \\ 3 & 11 \end{pmatrix}$$

In the general case we can write $a^{*}_{ij} = a_{ji}$ (why?). Obviously, we
always have (A*)* = A.

A matrix coinciding with its transpose is called symmetric. Of
course, only a square matrix can be symmetric. The symmetry
condition can be put down in the form $a_{ij} = a_{ji}$.
If we have $a_{ij} = -a_{ji}$ for all the elements of a matrix then the
matrix is called skew-symmetric (antisymmetric).

A square matrix A has its determinant which we shall denote by
det A. For example, $\det \begin{pmatrix} 1 & 0 \\ 2 & -3 \end{pmatrix} = -3$. Only square
matrices have determinants; it is impossible to speak about the
determinant of a rectangular matrix which is not square. It follows
from Sec. VI.2 that

$$\det I = 1 \quad \text{and} \quad \det A^{*} = \det A$$


2. Operations on Matrices. Two matrices with the same numbers 
of rows and columns are added together according to the following 
rule: 


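$$\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \\ a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23} \end{pmatrix}$$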


The multiplication of a matrix by a number is defined in an analo- 
gous way: 

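$$k \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} = \begin{pmatrix} ka_{11} & ka_{12} & ka_{13} \\ ka_{21} & ka_{22} & ka_{23} \end{pmatrix}$$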

We can easily verify that all the axioms of linear operations (see
Sec. VII.17) hold for the above operations. Hence, the set of all
matrices of the same size is a linear space. Let us put down the
following obvious formulas:

$$(A + B)^{*} = A^{*} + B^{*}, \quad (kA)^{*} = kA^{*} \quad \text{and} \quad \det(kC) = k^{n} \det C$$

where n is the order of the square matrix C. By the way, in the
general case we have $\det(A + B) \ne \det A + \det B$.

Now we are going to introduce the rule of multiplication of a
matrix by another matrix. The advisability of this peculiar rule
will be clarified in Sec. 6. First of all, for two given matrices to be
multiplied by each other, it is necessary that the number of columns
of the first matrix factor be equal to the number of the rows of the
second factor; otherwise, the multiplication is impossible. This
condition being fulfilled, the product is found according to the rule

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} & a_{11}b_{13} + a_{12}b_{23} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} & a_{21}b_{13} + a_{22}b_{23} \end{pmatrix}$$


The reader should pay much attention to the structure of the
formula. For instance, to obtain the element of the product belonging to
the first row and to the third column we must take the first row of
the first factor and the third column of the second factor and then
multiply them as if we computed the scalar product of the
corresponding number vectors [see formula (VII.12)]. Other elements of
the matrix which is the product of the two given matrices are also
obtained by means of a similar operation resembling scalar
multiplication of the rows of the first matrix by the columns of the second
matrix. In the general case when we multiply an $(m \times n)$ matrix
$(a_{ij})$ by an $(n \times p)$ matrix $(b_{ij})$ we obtain an $(m \times p)$ matrix $(c_{ij})$
whose elements are found according to the formula

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$
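
The row-by-column rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: matrices are stored as lists of rows, the function name `matmul` and the numerical values are hypothetical, and nothing here comes from the book itself.

```python
def matmul(a, b):
    """Product of an (m x n) matrix a and an (n x p) matrix b (lists of rows)."""
    n = len(b)  # number of rows of the second factor
    if any(len(row) != n for row in a):
        raise ValueError("columns of the first factor must equal rows of the second")
    p = len(b[0])
    # c[i][j] is the scalar product of row i of a and column j of b
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(a))]


A = [[1, 2], [3, 4]]          # (2 x 2), illustrative values
B = [[5, 6, 7], [8, 9, 10]]   # (2 x 3), illustrative values
C = matmul(A, B)              # (2 x 3) product
print(C)
```

Calling `matmul(B, B)` raises `ValueError`, since a (2 x 3) matrix has three columns while the second factor has only two rows.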

The above rule implies that we can always mutually multiply 
two square matrices of the nth order which results in a square matrix 
of the same order. In particular, we can always multiply a square 
matrix by itself, that is we can raise it to the second power, but 
this cannot be done with a non-square rectangular matrix. There 
is another important particular case when we multiply a row matrix 
by a column matrix under the condition that they contain the same 
number of elements; this yields a square matrix of the first order, 
that is a number: 

$$(a_1 \;\; a_2 \;\; a_3) \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = a_1 b_1 + a_2 b_2 + a_3 b_3$$

By analogy with Sec. VI.2, we can verify the following
properties of the product of matrices:

$$(kA)B = A(kB) = k(AB), \quad (A + B)C = AC + BC,$$

$$C(A + B) = CA + CB, \quad A(BC) = (AB)C$$

Of course, in all these formulas we suppose that the numbers of
rows and columns of matrices entering into these expressions
guarantee the possibility of the corresponding multiplications. Another
method of deducing the formulas will be given in Sec. 6.

The simplest examples indicate that, generally speaking, the
multiplication of matrices is non-commutative, i.e. $AB \ne BA$. Let
the reader verify the following relations:


(1 2 )-( 


0 

0 

0 

-2 


0 
0 

= —4, 


0> 

0, 


:). c :h: k 


Besides, we have 
1 - 2 ’ 
-1 -3. 


0 

-2 
1 


•(1 2) 
r-3\ 


0 

-2 


(JM-i =i) 


( i — 3 ) ‘( 2 ) = (— 7 ) w ^ ereas the expression 

makes no sense.

The non-commutativity of matrix multiplication makes it necessary
to keep the order of factors. Therefore, to specify the order, we
say “to multiply A on the right by B” or simply “to multiply A by B”
(the operation results in AB) but when speaking about the product
BA we say “to multiply A on the left by B”.

We also indicate the property 

(AB)* = B*A* (4) 

which can be easily verified, and the property 

$$\det(AB) = \det A \cdot \det B \qquad (5)$$


which will be proved in Sec. 7. 
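
Non-commutativity and properties (4) and (5) can be checked numerically on small cases. The sketch below uses matrices chosen here purely for illustration (they are not the book's own examples), and the helper names `matmul`, `transpose` and `det2` are hypothetical.

```python
def matmul(a, b):
    # row-by-column product of matrices stored as lists of rows
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def det2(a):
    # determinant of a (2 x 2) matrix
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


A = [[1, 2], [3, 4]]   # illustrative values
B = [[0, 1], [1, 0]]   # illustrative values
AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA, AB != BA)                                      # the two products differ
print(transpose(AB) == matmul(transpose(B), transpose(A)))   # property (4)
print(det2(AB) == det2(A) * det2(B))                         # property (5)

row = [[1, 2]]         # (1 x 2) row matrix
col = [[3], [4]]       # (2 x 1) column matrix
print(matmul(row, col), matmul(col, row))  # a (1 x 1) matrix versus a (2 x 2) matrix
```

A check of this kind only confirms the formulas for the chosen matrices; it is not a proof.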

If A is a complex number matrix the symbol A* designates the
result of the operation of transposition with the simultaneous
replacement of all the elements by their complex conjugates. In this
case A* is said to be the transposed conjugate matrix. The above
formulas will hold for complex matrices if we change two of the
formulas, namely if we write $\det A^{*} = (\det A)^{*}$ and $(kA)^{*} = k^{*}A^{*}$.

3. Inverse Matrix. Here we shall consider square matrices. For
definiteness, let us take matrices of the third order. The role of
unit matrix (3) in the operation of multiplying matrices is
analogous to the role of the number 1 in the operation of multiplying
numbers. Indeed, we can easily verify that AI = IA = A for any
matrix A.

By analogy with number multiplication, we define the notion
of a matrix $A^{-1}$ which is the inverse of the matrix A: by definition,
we put

$$A^{-1}A = AA^{-1} = I \qquad (6)$$

From (6) and from equality (5) it follows that

$$\det A \cdot \det(A^{-1}) = \det I = 1, \quad \text{that is} \quad \det(A^{-1}) = \frac{1}{\det A}$$

We see that for the inverse matrix to exist it is necessary that det A
be unequal to zero: $\det A \ne 0$. A square matrix A for which
$\det A = 0$ is called degenerate (singular). Consequently, a degenerate
matrix has no inverse. At the same time, every non-degenerate
(non-singular) matrix has its inverse. Actually, let us take an
arbitrary non-degenerate matrix

$$K = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix} \qquad (7)$$

Then, bearing in mind the definition of the product of matrices,
we verify, reasoning as in Sec. VI.4, that the multiplication of K
on the left or on the right by the matrix

$$\frac{1}{\det K} \begin{pmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{pmatrix} \qquad (8)$$

yields the matrix I (the capital letters $A_1, \dots, C_3$ designate the
corresponding cofactors of the elements of the determinant of the
matrix K, see Sec. VI.3). Matrix (8) is therefore nothing but the
matrix $K^{-1}$.
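
Formula (8) can be tried out on a concrete third-order matrix. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions` module; the helper names and the test matrix are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def minor(m, i, j):
    # delete row i and column j of a (3 x 3) matrix
    return [[m[r][c] for c in range(3) if c != j] for r in range(3) if r != i]


def cofactor(m, i, j):
    return (-1) ** (i + j) * det2(minor(m, i, j))


def det3(m):
    # expansion of the determinant along the first row
    return sum(m[0][j] * cofactor(m, 0, j) for j in range(3))


def inverse3(m):
    d = det3(m)
    if d == 0:
        raise ValueError("a degenerate matrix has no inverse")
    # as in (8): entry (i, j) of the inverse is the cofactor of element (j, i) over det
    return [[Fraction(cofactor(m, j, i), d) for j in range(3)] for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


K = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]    # illustrative non-degenerate matrix
K_inv = inverse3(K)
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(K, K_inv) == I3, matmul(K_inv, K) == I3)
```

Note the transposition built into (8): the cofactors of a column of K fill a row of the inverse.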

Inverse matrices can be applied to solving matrix equations.
For instance, let us consider the equation AX = B where A and
B are given matrices and X is an unknown matrix. Let us suppose
that $\det A \ne 0$. Then multiplying both sides on the left by $A^{-1}$
and taking advantage of equalities (6) we deduce $X = A^{-1}B$.
Similarly, the solution of the equation XA = B is $X = BA^{-1}$ provided
$\det A \ne 0$.

Matrices enable us to put down a system of equations of the first
degree in an abridged form of a matrix equation. For example,
system of equations (VI.5) can be rewritten in the matrix form

$$\begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix}$$

(check it up!). If we designate the coefficient matrix by the letter A,
the column of unknowns (which is a number vector) by x and the
column of the constant terms by d the system is put down in a still
more abridged form as

$$Ax = d \qquad (9)$$

If $\det A \ne 0$ then (9) implies that the solution is

$$x = A^{-1}d \qquad (10)$$

If we write the formula in full we shall again obtain Cramer's rule
deduced in Sec. VI.4. It should be noted that formula (9) also makes
sense when the number of equations differs from the number of
unknowns but in such a case the matrix A will not be square; for such
a system formula (10) no longer holds because only a square matrix
has its determinant.

Equation (9) with a concrete square matrix A and a column d
composed of letters can be solved according to Gauss' method (see
Sec. VI.5). After such a solution we come to formula (10), that is
we obtain the matrix $A^{-1}$ as the matrix of the coefficients in the
coordinates of the vector d. This method of constructing an inverse
matrix is practically more convenient than the application of
formula (8) especially when the order of the matrix is large.
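
One common way to organize this computation is to eliminate on the augmented array (A | I), which amounts to solving (9) for every unit column d at once. The sketch below is an assumption-laden illustration of that idea (Gauss-Jordan elimination with exact fractions), not the procedure of Sec. VI.5 itself; the name `inverse_gauss` and the numbers are hypothetical.

```python
from fractions import Fraction


def inverse_gauss(a):
    """Invert a square matrix by eliminating on the augmented array (A | I)."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("a degenerate matrix has no inverse")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]           # make the pivot equal to 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                    # right half is the inverse


A = [[2, 1], [5, 3]]                                   # illustrative matrix, det A = 1
A_inv = inverse_gauss(A)
d = [1, 2]
x = [sum(A_inv[i][k] * d[k] for k in range(2)) for i in range(2)]   # x = A^{-1} d
print(A_inv, x)
```

For large orders this needs on the order of n^3 arithmetic operations, whereas formula (8) requires n^2 cofactors, each of them a determinant of order n - 1.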

Formula (6) indicates that the matrices A and $A^{-1}$ are mutually
inverse, that is $(A^{-1})^{-1} = A$. Besides, we sometimes apply the
formula $(AB)^{-1} = B^{-1}A^{-1}$ (where $\det A \ne 0$ and $\det B \ne 0$) which
can be readily verified:

$$(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I$$

Finally, substituting $B = A^{-1}$ into formula (4) we deduce

$$(A^{-1})^{*}A^{*} = (AA^{-1})^{*} = I^{*} = I, \quad \text{that is} \quad (A^{-1})^{*} = (A^{*})^{-1}$$

4. Eigenvectors and Eigenvalues of a Matrix. Let A be a given
square matrix. As we shall see later, we sometimes encounter an
equation of the form

$$Ax = \lambda x \qquad (11)$$

where x is an unknown number vector and $\lambda$ is an unknown number,
the dimension of x being equal to the order of A. Equation (11)
has the trivial solution x = 0 for any $\lambda$ but we shall be interested
only in those $\lambda$ for which the system has non-trivial solutions. A
number $\lambda$ of this kind is called an eigenvalue of the matrix A and
the corresponding solution x of equation (11) is called an eigenvector
of the matrix A.

Eigenvalues and eigenvectors can be found as follows. Since
x = Ix we can rewrite equation (11) in the form

$$(A - \lambda I)x = 0 \qquad (12)$$

Comparing formula (12) with formula (9) we see that we have arrived
at a system of n algebraic homogeneous linear equations in n
unknowns where n is the order of the matrix A. According to Sec. VI.6,
for a non-trivial solution to exist, it is necessary and sufficient
that the determinant of the system be equal to zero, i.e.

$$\det(A - \lambda I) = 0 \qquad (13)$$

This equation is called the characteristic equation of the matrix
A and it enables us to find the eigenvalues $\lambda$. For instance, in the
case of matrix (7) the equation has the form

$$\begin{vmatrix} a_1 - \lambda & b_1 & c_1 \\ a_2 & b_2 - \lambda & c_2 \\ a_3 & b_3 & c_3 - \lambda \end{vmatrix} = 0$$

Writing the determinant in full we see that this is an algebraic
equation whose degree is equal to the order of the matrix A. By
Sec. VIII.8, we conclude that a matrix of order n has n
eigenvalues. Of course, some of them may be complex and some may
coincide.

If we consider only real numbers and real vectors then equation 
(11) is satisfied only in the case when we take a real root of the cha- 
racteristic equation (provided there are such). But if we admit com- 
plex numbers then every root of the characteristic equation can be 
substituted into (11). 
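
For a matrix of the second order the characteristic equation (13) is the quadratic $\lambda^2 - (a_{11} + a_{22})\lambda + \det A = 0$, so its roots can be written down directly. The sketch below assumes Python's standard `cmath` module so that complex roots are handled too; the function name `eigenvalues2` and both sample matrices are illustrative choices, not taken from the book.

```python
import cmath


def eigenvalues2(a):
    """Roots of det(A - lambda*I) = 0 for a (2 x 2) matrix given as a list of rows."""
    tr = a[0][0] + a[1][1]                          # coefficient of -lambda
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]     # constant term
    disc = cmath.sqrt(tr * tr - 4 * det)            # may be imaginary
    return (tr + disc) / 2, (tr - disc) / 2


symmetric = [[2, 1], [1, 2]]      # two real eigenvalues
quarter_turn = [[0, -1], [1, 0]]  # rotation through 90 degrees: no real eigenvalues
print(eigenvalues2(symmetric))
print(eigenvalues2(quarter_turn))
```

The second matrix has only imaginary roots, so equation (11) has no non-trivial real solution for it: a rotation through a right angle leaves no real direction unchanged.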

After an eigenvalue has been found we can determine the
corresponding eigenvector by solving vector equation (12). For this purpose
we rewrite the equation in the form of a system of scalar equations
and apply the methods of Sec. VI.6. Equation (12) implies that,
for a fixed $\lambda$, the sum $y = x^1 + x^2$ of particular solutions $x^1$ and $x^2$
is a solution of the same system and that the product $z = kx$ of a
solution x by a number k is also a solution. Hence, the set of all
eigenvectors corresponding to a given eigenvalue is a linear
subspace (see Sec. VII.18) of the space of all number vectors of
dimension n.

The most important case here is when all the eigenvalues are
distinct. In this case the subspace corresponding to a given eigenvalue
$\lambda$ is one-dimensional, that is an eigenvector corresponding to the
eigenvalue is defined to within a numerical factor [here we also
mean complex eigenvalues because, as it has been mentioned,
characteristic equation (13) with real coefficients can have both real
and imaginary roots]. The fact that such a subspace must be
one-dimensional is implied by the property that nonzero eigenvectors
corresponding to different eigenvalues are necessarily linearly
independent and by the definition of a dimension which indicates
that an n-dimensional linear space cannot have more than n linearly
independent vectors. The linear independence can be proved as
follows: if some eigenvectors $x^1$, $x^2$, $x^3$ correspond to different
eigenvalues $\lambda_1$, $\lambda_2$, $\lambda_3$ and if $x^1$ and $x^2$ are linearly independent whereas
$x^3 = \alpha x^1 + \beta x^2$ then multiplying both sides of the equality on
the left by A we obtain $\lambda_3 x^3 = \alpha\lambda_1 x^1 + \beta\lambda_2 x^2$; after that,
multiplying the first equality by $\lambda_3$ and subtracting it from the second
one we deduce $\alpha(\lambda_1 - \lambda_3)x^1 + \beta(\lambda_2 - \lambda_3)x^2 = 0$, which
contradicts the linear independence of $x^1$ and $x^2$ (the coefficients
$\alpha$ and $\beta$ cannot both vanish because $x^3 \ne 0$). The case when $x^1$
and $x^2$ are linearly dependent is treated similarly.
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vectors and some other vectors by letters in ordinary type equipped 
with arrows.) In this case A is therefore the sign indicating the rota- 

tion of a vector; to each vector x there corresponds the vector Ax. 
In other words, we have defined a mapping A of the plane of vectors 
into itself. The terms a “transformation A” or an “operator A” are 

used synonymously in such a case. A given vector x is called a pre- 

" v 

image (an original, or an inverse image) and the vector Ax is called 

the image of x under the mapping A. 

Since a rotation transforms parallelograms into parallelograms, we
conclude that the addition of preimages yields the addition of
the corresponding images (see Fig. 222). In other words, in
the case of rotation the image of a sum is the sum of the images:
$A(\vec{x}_1 + \vec{x}_2) = A\vec{x}_1 + A\vec{x}_2$. We can similarly verify that the
multiplication of a preimage by a number yields the multiplication
of the image by the same number, i.e. $A(\lambda\vec{x}) = \lambda A\vec{x}$. Hence,
under the mapping in question, to linear operations on the
preimages there correspond analogous linear operations on images,
that is linear relations between vectors remain valid after the
mapping has been performed. For instance, if $\vec{x}_1 = 2\vec{x}_2 - 5\vec{x}_3$ then
$A\vec{x}_1 = 2A\vec{x}_2 - 5A\vec{x}_3$ etc. This property is called the linearity
of the mapping.

The operation of projecting all the vectors in space on a fixed 
plane or on a straight line possesses the same properties. The verifi- 
cation of the properties is left to the reader. We can take quite 
a different example now. Let us consider the space of all polynomials 
(see Sec. VII. 18) and define the image of each of the polynomials as 
its derivative. The linearity of this mapping is implied by the facts 
that the derivative of a sum is equal to the sum of the derivatives 
and that a constant factor can be taken outside the sign of diffe- 
rentiation. 
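
The linearity of differentiation on polynomials can be checked mechanically. In this sketch a polynomial is represented, by assumption, as a list of coefficients in increasing powers; the helper names `derivative`, `add` and `scale` and the sample polynomials are hypothetical.

```python
def derivative(p):
    """Coefficients of p'(t), where p = [c0, c1, c2, ...] means c0 + c1*t + c2*t^2 + ..."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def add(p, q):
    # pad the shorter coefficient list with zeros, then add termwise
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def scale(k, p):
    return [k * c for c in p]


p = [1, 2, 3]       # 1 + 2t + 3t^2
q = [0, 0, 0, 4]    # 4t^3
print(derivative(add(p, q)), add(derivative(p), derivative(q)))   # image of a sum
print(derivative(scale(5, p)), scale(5, derivative(p)))           # image of a multiple
```

Both printed pairs agree, which is exactly the pair of conditions that makes the mapping linear.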

Fig. 222. $\vec{z} = \vec{x}_1 + \vec{x}_2$, $A\vec{z} = A\vec{x}_1 + A\vec{x}_2$, $A(\vec{x}_1 + \vec{x}_2) = A\vec{x}_1 + A\vec{x}_2$.

We now proceed to give the general definition of a linear mapping.
Let two linear spaces (R) and (S) (see Sec. VII.17) be given. Suppose
that there is a law, a rule, according to which to every vector $\vec{x} \in (R)$
there corresponds a certain vector $\vec{y} = A\vec{x} \in (S)$. Then we say that
we are given a mapping A of the space (R) into the space (S). [If
(S) = (R) then we say that there is a mapping of the space (R)
into itself. If every vector $\vec{y} \in (S)$ is the image of a vector $\vec{x} \in (R)$
under a mapping A then we say that A maps (R) onto (S).] A
mapping is said to be linear if for any $\vec{x}_1, \vec{x}_2 \in (R)$ and for any number
$\lambda$ we have

$$A(\vec{x}_1 + \vec{x}_2) = A\vec{x}_1 + A\vec{x}_2 \quad \text{and} \quad A(\lambda\vec{x}) = \lambda A\vec{x} \qquad (20)$$

Applying these properties several times we readily deduce

$$A(\lambda_1\vec{x}_1 + \lambda_2\vec{x}_2 + \dots + \lambda_k\vec{x}_k) = \lambda_1 A\vec{x}_1 + \lambda_2 A\vec{x}_2 + \dots + \lambda_k A\vec{x}_k \qquad (21)$$

Hence, a linear transformation does not change the form of a linear 
combination because the coefficients remain the same, and it is 
only preimages that are replaced by the corresponding images here. 
Hence, not only the sum of preimages goes into the sum of the 
images but also the difference goes into the difference and so on. 
Putting $\lambda = 0$ in the second equality (20) we deduce the relation
$A\vec{0} = \vec{0}$ which holds for all the linear mappings. Here we have,
of course, the zero vector of the space (S) on the right-hand side
and the zero vector of the space (R) under the sign of the linear
mapping A on the left-hand side.

For definiteness, let us suppose that the space (R) is
three-dimensional and the space (S) is two-dimensional. Let us arbitrarily
choose a basis $\vec{p}_1, \vec{p}_2, \vec{p}_3$ in (R) and a basis $\vec{q}_1, \vec{q}_2$ in (S). Each of
the vectors $A\vec{p}_j$ belongs to (S) and it can therefore be resolved with
respect to the basis $\vec{q}_1, \vec{q}_2$. Let us introduce the notation

$$\begin{aligned} A\vec{p}_1 &= a_{11}\vec{q}_1 + a_{21}\vec{q}_2 \\ A\vec{p}_2 &= a_{12}\vec{q}_1 + a_{22}\vec{q}_2 \\ A\vec{p}_3 &= a_{13}\vec{q}_1 + a_{23}\vec{q}_2 \end{aligned} \qquad (22)$$

Then, by formula (21), any vector $\vec{x} = x_1\vec{p}_1 + x_2\vec{p}_2 + x_3\vec{p}_3 \in (R)$
goes into a vector $\vec{y} = A\vec{x} = y_1\vec{q}_1 + y_2\vec{q}_2 \in (S)$ according to the
following rule:

$$\vec{y} = A(x_1\vec{p}_1 + x_2\vec{p}_2 + x_3\vec{p}_3) = x_1 A\vec{p}_1 + x_2 A\vec{p}_2 + x_3 A\vec{p}_3 = (a_{11}x_1 + a_{12}x_2 + a_{13}x_3)\vec{q}_1 + (a_{21}x_1 + a_{22}x_2 + a_{23}x_3)\vec{q}_2$$

that is

$$\begin{aligned} y_1 &= a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ y_2 &= a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \end{aligned} \qquad (23)$$





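Thus, we have arrived at formulas which express transformation of
the coordinates of a vector under a linear mapping. If we denote

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \quad \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \quad \mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}$$

then, by Sec. 2, formulas (23) can be rewritten in the form

$$\mathbf{y} = \mathbf{A}\mathbf{x} \qquad (24)$$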

The number matrix $\mathbf{A}$ entering into formula (24) is called the matrix
of the linear mapping (operator) A relative to the given bases $\vec{p}_j$
and $\vec{q}_i$ since it depends not only on the mapping itself but also on
the choice of the bases. The components of the number vectors $\mathbf{x}$ and
$\mathbf{y}$ depend not only on the vectors $\vec{x}$ and $\vec{y}$ but also on the bases.

If we have two bases $\vec{p}_j$, $\vec{q}_i$ chosen in the spaces (R) and (S) and
if it is known that any vector $\vec{x} = x_1\vec{p}_1 + x_2\vec{p}_2 + x_3\vec{p}_3 \in (R)$ is
carried into a vector $\vec{y} = A\vec{x} = y_1\vec{q}_1 + y_2\vec{q}_2 \in (S)$ whose coordinates
$y_1$, $y_2$ are defined by formulas (23) the mapping $\vec{y} = A\vec{x}$ is linear.
When we add vectors their similar components are also added and
the corresponding number vectors are added too; but (24) implies
that when we add number vectors $\mathbf{x}$ the corresponding vectors
$\mathbf{y}$ are also added. The second property (20) is verified similarly.

Hence, if certain bases $\vec{p}_j$ and $\vec{q}_i$ are chosen then to every linear
mapping of (R) into (S) there corresponds its matrix which is the
transpose of the coefficient matrix of the expansion of the vectors
$A\vec{p}_j$ with respect to the basis $\vec{q}_i$. Conversely, each matrix with the
corresponding numbers of rows and columns [a $(2 \times 3)$ matrix in
our case] is a matrix of a linear mapping of (R) into (S). Evidently,
if (R) is of dimension n and (S) is of dimension m then the matrix
of a mapping of (R) into (S) is an $(m \times n)$ matrix. In particular,
when (S) = (R) the matrix of a linear mapping is square. In such
a case we usually expand vectors with respect to the same basis
before and after the mapping, unless the contrary is stated.
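
The remark about the transpose can be made concrete: the image of the j-th basis vector supplies the j-th column of the matrix. The sketch below assumes the standard bases in the number spaces of dimensions 3 and 2 and uses a made-up linear mapping `sample_map`; `matrix_of` and `apply` are hypothetical helper names.

```python
def sample_map(x):
    # a hypothetical linear mapping from 3-dimensional to 2-dimensional number vectors
    x1, x2, x3 = x
    return [x1 + 2 * x3, x2 - x3]


def matrix_of(mapping, n):
    # images of the n basis vectors, listed one after another as in (22)
    images = [mapping([int(i == j) for i in range(n)]) for j in range(n)]
    # transposing turns each image into a column of the matrix
    return [list(row) for row in zip(*images)]


def apply(m, x):
    # coordinates of the image by formulas (23), i.e. y = A x
    return [sum(m[i][k] * x[k] for k in range(len(x))) for i in range(len(m))]


M = matrix_of(sample_map, 3)     # a (2 x 3) matrix
x = [3, 4, 5]
print(M, apply(M, x), sample_map(x))
```

The last two printed vectors coincide, so multiplying by the matrix reproduces the mapping on this vector.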

The rank of a matrix $\mathbf{A}$ being equal to the maximal number of
its linearly independent columns, formula (22) implies that the
rank equals the number of linearly independent vectors among the
vectors $A\vec{p}_j$. Thus, the rank is equal to the dimension of the linear
space A(R) constituted by the images of all the vectors of (R) under
the mapping A. The space A(R) can either coincide with (S) or
constitute a subspace of (S) having a lower dimension. As it has
been already mentioned, in the first case we say that (R) is mapped
onto (S). In particular, it follows that although the matrix $\mathbf{A}$ of
a mapping A of (R) into (S) depends on the choice of bases in (R)
and (S) the rank of the matrix is independent of the choice. Besides,
the rank of an $(m \times n)$ matrix not exceeding n, the dimension of
A(R) cannot exceed that of (R). Consequently, we see that a
dimension cannot increase under a linear mapping (in § 4 we shall
see that the same is true for a non-linear mapping).

Instead of considering a mapping of vectors into vectors we can
also consider a mapping of points into points which is more visual.
Let us suppose that to each point M of a plane (P) there corresponds
a certain point $\bar{M}$ of a plane $(\bar{P})$. Then we can say that we are given
a mapping of the plane (P) into the plane $(\bar{P})$. In such a case we
shall write $\bar{M} = f(M)$ (compare with the consideration in Sec. IX.9
concerning this notation). Suppose that the mapping f is such that
it does not violate rectilinearity, that is suppose that vectors lying
in the plane (P) are carried into vectors lying in the plane $(\bar{P})$ under
the mapping. In addition, let us suppose that equal vectors in the
plane (P) go into equal vectors in the plane $(\bar{P})$. Then we can say
that to each vector $\vec{x}$ of the plane (P) there corresponds a completely
specified vector $\vec{y}$ of the plane $(\bar{P})$ which is independent of the
disposition of the origin of $\vec{x}$ in (P). Let us denote $\vec{y}$ as $\vec{y} = A\vec{x}$.
Finally, let the mapping A be linear. (The example at the beginning of
Sec. 6 satisfies all the requirements enumerated here. The planes (P)
and $(\bar{P})$ coincide, and f(M) is understood as the result of rotation
of the point M through the angle $\alpha$ about the centre of rotation.)

Now let us choose in the plane (P) an arbitrary affine coordinate
system (see Sec. VII.9) with the coordinates designated as $x_1$, $x_2$,
with the origin of coordinates O and the base vectors $\vec{p}_1$, $\vec{p}_2$. Then
the radius-vector is represented in the form $\vec{r} = x_1\vec{p}_1 + x_2\vec{p}_2$. Let
us also choose in the plane $(\bar{P})$ an arbitrary affine coordinate system
$y_1$, $y_2$ with the origin $\bar{O}$ and the base vectors $\vec{q}_1$, $\vec{q}_2$. Denote the
coordinates of the point f(O) belonging to the plane $(\bar{P})$ by $b_1$, $b_2$.
Let the coordinates $x_1$, $x_2$ of a point M of the plane (P) be given.
What are the coordinates $y_1$, $y_2$ of the corresponding point f(M)?
We have

$$\overrightarrow{\bar{O}f(M)} = \overrightarrow{\bar{O}f(O)} + \overrightarrow{f(O)f(M)} = b_1\vec{q}_1 + b_2\vec{q}_2 + A(\overrightarrow{OM})$$

and therefore, by the above formulas for transformation of the
coordinates of a vector, we obtain

$$\begin{aligned} y_1 &= a_{11}x_1 + a_{12}x_2 + b_1 \\ y_2 &= a_{21}x_1 + a_{22}x_2 + b_2 \end{aligned} \qquad (25)$$

or, in the matrix notation,

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{b} \qquad (26)$$

where $\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ is the matrix of the mapping A relative to
the chosen bases and $\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$. Conversely, we can easily verify
that if the coordinates of points are transformed according to
formulas (25) the corresponding mapping possesses the properties
described in the preceding paragraph. The simplest formulas are
obtained when the origin of coordinates in the plane (P) is carried
into the origin of coordinates in the plane $(\bar{P})$ under the mapping
in question. We have $b_1 = b_2 = 0$ in such a case, and therefore
formulas (25) turn into

$$\begin{aligned} y_1 &= a_{11}x_1 + a_{12}x_2 \\ y_2 &= a_{21}x_1 + a_{22}x_2 \end{aligned} \qquad (27)$$

i.e. $\mathbf{y} = \mathbf{A}\mathbf{x}$. The coordinates of vectors are also transformed
according to formulas (27) in the general case (25).

If we consider the geometric space or an abstract Cartesian space 
of any dimension (see Sec. VII. 18) we arrive at formulas similar 
to (25) but with different numbers of rows and columns. The matrix 
form of writing will have form (26) again but in the general case the 
rectangular matrix A may not be square since we can have a mapping 
of one space into another when their dimensions are unequal. 

Let us suppose now that the dimensions are equal. For simplicity's
sake, let us again consider a mapping of a plane (P) into a plane $(\bar{P})$.
If $\det \mathbf{A} \ne 0$ the mapping is called affine. In this case we can
multiply equality (26) on the left by $\mathbf{A}^{-1}$ which results in
$\mathbf{x} = \mathbf{A}^{-1}\mathbf{y} - \mathbf{A}^{-1}\mathbf{b}$. Consequently, we obtain an equality of the
same form (26). The inverse mapping of the plane $(\bar{P})$ into the
plane (P) is therefore also affine.
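
The passage from (26) to the inverse mapping can be followed on numbers. This is a minimal sketch for the plane only, assuming exact fractions; the shear matrix, the shift and the helper names `affine` and `inverse_affine` are illustrative assumptions.

```python
from fractions import Fraction


def affine(A, b, x):
    # y = A x + b for a (2 x 2) matrix A and 2-component columns b, x
    return [A[i][0] * x[0] + A[i][1] * x[1] + b[i] for i in range(2)]


def inverse_affine(A, b):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("det A = 0: the mapping is not affine")
    inv = [[Fraction(A[1][1], det), Fraction(-A[0][1], det)],
           [Fraction(-A[1][0], det), Fraction(A[0][0], det)]]
    # x = A^{-1} y - A^{-1} b, so the new constant column is -A^{-1} b
    shift = [-(inv[i][0] * b[0] + inv[i][1] * b[1]) for i in range(2)]
    return inv, shift


A = [[1, 2], [0, 1]]     # illustrative shear along the first axis
b = [3, -1]              # illustrative translation
x = [4, 5]
y = affine(A, b, x)
A_inv, b_inv = inverse_affine(A, b)
print(y, affine(A_inv, b_inv, y))   # the second result returns to x
```

The inverse is again described by a matrix and a constant column, which is the sense in which it has "the same form (26)".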

In Fig. 223 we illustrate the most important types of affine
mapping of a plane onto itself for which the origin of coordinates
remains at the same place. Formulas for the transformations of
coordinates and the corresponding matrices relative to a Cartesian
coordinate system are also put down in Fig. 223. (Let the reader
prove the formulas in the third example taking advantage of the
formulas $y_1 = \rho\cos(\varphi + \alpha)$ and $y_2 = \rho\sin(\varphi + \alpha)$ where
$\rho = OM = O\bar{M}$.) Of course, we can also consider different combinations of
these simple mappings and additional parallel translations as well.

If $\det \mathbf{A} = 0$ then the rank of the matrix is equal either to 1 or
to 0. As it was shown above, in the first case the plane (P) is mapped
onto a straight line (in particular, we have such a case when we
consider the operation of projecting) and in the second case the
plane goes into a point of the plane $(\bar{P})$.

Fig. 223. Affine mappings of a plane:

(a) k-fold stretching along the $x_1$-axis: $y_1 = kx_1$, $y_2 = x_2$; matrix $\begin{pmatrix} k & 0 \\ 0 & 1 \end{pmatrix}$

(b) k-fold stretching in all directions: $y_1 = kx_1$, $y_2 = kx_2$; matrix $\begin{pmatrix} k & 0 \\ 0 & k \end{pmatrix}$

(c) Rotation through the angle $\alpha$: $y_1 = x_1\cos\alpha - x_2\sin\alpha$, $y_2 = x_1\sin\alpha + x_2\cos\alpha$; matrix $\begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}$

(d) Shear along the $x_1$-axis: $y_1 = x_1 + hx_2$, $y_2 = x_2$; matrix $\begin{pmatrix} 1 & h \\ 0 & 1 \end{pmatrix}$

(e) Reflection in the $x_2$-axis: $y_1 = -x_1$, $y_2 = x_2$; matrix $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$

By analogy with Sec. II.7, we can readily prove that the order
of an algebraic curve does not change under an affine mapping. In
particular, the straight lines being the only curves of the first order
(see Sec. II.9), an affine mapping transforms straight lines into
straight lines. Since the point of intersection of two lines must be
carried into the point of intersection of their images under the
mapping, intersecting lines go into intersecting lines, and consequently
parallel straight lines go into parallel lines. On the basis of the
definition we can verify that the ratio of two parallel line segments
does not change under an affine mapping; at the same time, the
ratio of non-parallel segments, angles and lengths are changed in
the general case. Curves of the second order are carried into curves
of the second order under an affine mapping. An ellipse being the
only finite curve of the second order, an affine mapping transforms
an ellipse into an ellipse (or into a circle in a particular case). A
parabola is an infinite curve of the second order consisting of one
component and it is therefore mapped onto a parabola. Finally, a
hyperbola is mapped onto a hyperbola.

Let us return to linear mappings of general linear spaces (see 
the beginning of this section). Let two such mappings A and B of 
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a space ( R ) into a space (<S) be given. Then we can define the sum of 

v * — v V 

the mappings by means of the formula (A + B) x = Ax + Bx. We 
similarly define the multiplication of a mapping by a number: 

(A. A) x = X (Ax). We readily verify that if some bases in (R) and 

( S ) are chosen and if the above operations are performed on mappings 
then the same operations are performed on the matrices of the map- 
pings. Besides, all the axioms of linear operations hold here. The 
role of the zero mapping is played by the mapping of the whole space 
( R ) into the zero vector of the space ( S ). 

The multiplication of mappings is defined as their successive
performance. More exactly, suppose we have a mapping B of a space
(R) into a space (S) and a mapping A of the space (S) into a space
(T). Then AB is understood as a “composite” mapping of (R) into
(T) which is obtained if we first perform the mapping of (R) into
(S) and then perform the mapping of (S) into (T), that is
$(AB)\vec{x} = A(B\vec{x})$. If some bases are chosen in (R), (S) and (T) the
multiplication of the mappings yields the multiplication of their matrices
which accounts for the rule of multiplication of matrices given in 
Sec. 2. The rule is extremely important and we shall therefore illu- 
strate what has been said by taking an example of matrices of the 
second order. Let the mappings B and A be represented by the cor- 
responding formulas 

$$\begin{aligned} y_1 &= b_{11}x_1 + b_{12}x_2 \\ y_2 &= b_{21}x_1 + b_{22}x_2 \end{aligned} \qquad \begin{aligned} z_1 &= a_{11}y_1 + a_{12}y_2 \\ z_2 &= a_{21}y_1 + a_{22}y_2 \end{aligned}$$

where $x_j$, $y_k$, $z_i$ ($j, k, i = 1, 2$) are the coordinates in the spaces under
consideration. Then in order to obtain the “composite” mapping we
must substitute the first formulas into the second which results in

$$\begin{aligned} z_1 &= a_{11}(b_{11}x_1 + b_{12}x_2) + a_{12}(b_{21}x_1 + b_{22}x_2) \\ &= (a_{11}b_{11} + a_{12}b_{21})x_1 + (a_{11}b_{12} + a_{12}b_{22})x_2 = c_{11}x_1 + c_{12}x_2, \\ z_2 &= a_{21}(b_{11}x_1 + b_{12}x_2) + a_{22}(b_{21}x_1 + b_{22}x_2) \\ &= (a_{21}b_{11} + a_{22}b_{21})x_1 + (a_{21}b_{12} + a_{22}b_{22})x_2 = c_{21}x_1 + c_{22}x_2 \end{aligned}$$

Thus, we have arrived at the matrix $C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$ formed according
to the rule indicated in Sec. 2 which shows that C = AB.
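The substitution carried out above can be checked numerically. The following is a minimal sketch in Python; the matrices A and B and the vector x are arbitrary illustrative values, and the helper names `matmul` and `apply` are introduced here only for the illustration.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows (rule of Sec. 2)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def apply(m, v):
    """Coordinates of the image of the vector v under the mapping with matrix m."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


# Illustrative values only (assumed, not taken from the text).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
x = [7, -3]

C = matmul(A, B)                      # matrix of the composite mapping AB
z_composite = apply(A, apply(B, x))   # first perform B, then A
z_product = apply(C, x)               # one mapping with the matrix C

print(C, z_composite, z_product)
```

Both routes give the same coordinates, which is exactly the statement C = AB for this particular choice of numbers.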

In connection with the relationship between the operations on 
mappings and the corresponding operations on their matrices (in 
particular, between the operations of multiplication) we can deduce 
all the properties of the operations on matrices from the properties 
of the operations on mappings because the properties of the latter 
are obviously implied by their linearity. Here we shall indicate 
only the property A (BC) = (AB) C of matrices, which is implied by the
corresponding property of mappings A (BC) = (AB) C, the latter being
justified by the fact that both on the left-hand side and on the
right-hand side we have the mappings which are obtained if we first perform
the mapping C and then the mappings B and A in succession. When
we consider the mappings of a space into itself, that is when (S) = (R),
the role of the unity in the operation of multiplication is
played by the identity mapping (unit mapping) under which every
vector is carried into itself: $I\vec{x} = \vec{x}$. It is clear that we always have
AI = IA = A. The matrix of the unit mapping is the unit matrix I 
relative to any basis chosen in the space in question. Similarly, the 
inverse matrix corresponds to the inverse mapping. We saw that the 
multiplication of matrices is non-commutative in the general case. 
This is obviously accounted for by the fact that when we reverse 
the order in which mappings are performed the result can be changed 
considerably. (For instance, let the reader verify that if we first
apply the mapping shown in Fig. 223a for $k = 2$ to the point (0, 1)
and then apply the mapping c for $\alpha = 90°$ this will result in the
point (−1, 0). But if we reverse the order of the operations we shall
obtain the point (−2, 0).)

Generally, two given operations, actions, are called commuting 
if the result of their successive application does not depend on the 
order they are performed, and they are called non-commuting if 
otherwise. (Think whether the following two “operations” commute: 
(a) filling a swimming-pool with water; (b) diving into the swim- 
ming-pool.) 
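A short Python sketch of two non-commuting mappings is given below. It assumes the stretching along the $x_1$-axis with $k = 2$ (matrix rows (2, 0) and (0, 1)) and the rotation through 90° written with its exact entries; the helper `apply` and the variable names are chosen here only for illustration.

```python
def apply(m, v):
    """Coordinates of the image of the vector v under the mapping with matrix m."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


STRETCH = [[2, 0], [0, 1]]    # y1 = 2*x1, y2 = x2
ROTATE = [[0, -1], [1, 0]]    # rotation through 90 degrees: cos = 0, sin = 1

p = [0, 1]

stretch_then_rotate = apply(ROTATE, apply(STRETCH, p))
rotate_then_stretch = apply(STRETCH, apply(ROTATE, p))

print(stretch_then_rotate, rotate_then_stretch)
```

The two results differ, so the two mappings do not commute; in matrix language the products of the two matrices taken in the two orders are different.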

7. Transformation of the Matrix of a Linear Mapping When the 
Basis Is Changed. It has been already noted that when we consider 
a linear mapping A of a space ( R ) into a space ( S ) the matrix of 
the mapping depends on the choice of the bases in both spaces. One 
and the same mapping can have a more complicated matrix relative 
to one basis and a simpler matrix relative to another basis. Let 
us investigate the relationship between the matrix of a mapping 
and the bases we choose. For this purpose let us return, for definiteness,
to the example in which we deduced formula (24). Let a new
basis $\vec{p}\,'_1, \vec{p}\,'_2, \vec{p}\,'_3$ be chosen in (R). Then any vector $\vec{x}$ can be resolved
both with respect to the new basis and to the old one:

$$\vec{x} = x_1\vec{p}_1 + x_2\vec{p}_2 + x_3\vec{p}_3 = x'_1\vec{p}\,'_1 + x'_2\vec{p}\,'_2 + x'_3\vec{p}\,'_3 \tag{28}$$

where $x_1, x_2, x_3$ are the old coordinates and $x'_1, x'_2, x'_3$ are the new
coordinates of the vector $\vec{x}$. Each of the new base vectors can be
expanded with respect to the old basis:

$$\vec{p}\,'_j = h_{1j}\vec{p}_1 + h_{2j}\vec{p}_2 + h_{3j}\vec{p}_3 \quad (j = 1, 2, 3) \tag{29}$$

where $h_{ij}$ are some coefficients which define the transformation
from the old basis to the new basis. Substituting formulas (29)
into (28) and equating the coefficients of the same basis vectors
$\vec{p}_1, \vec{p}_2, \vec{p}_3$ we obtain

$$\begin{aligned} x_1 &= h_{11}x'_1 + h_{12}x'_2 + h_{13}x'_3 \\ x_2 &= h_{21}x'_1 + h_{22}x'_2 + h_{23}x'_3 \\ x_3 &= h_{31}x'_1 + h_{32}x'_2 + h_{33}x'_3 \end{aligned} \tag{30}$$

(let the reader verify the calculations!). It should be noted that
a matrix of type

$$H = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}$$

that is a transformation matrix from new coordinates to old coordinates
(from $x'_1, x'_2, x'_3$ to $x_1, x_2, x_3$ in our concrete example) must
necessarily be non-degenerate. Indeed, when the coordinates $x_1, x_2, x_3$
are given we must obtain certain completely specified values of
$x'_1, x'_2, x'_3$ and system of equations (30) must therefore be compatible
and must have a unique solution. Therefore, $\det H \neq 0$.

Formulas (30) can be put down by analogy with formulas (23)
in the abridged form

$$\mathbf{x} = H\mathbf{x}'$$

where $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ and $\mathbf{x}' = \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix}$.

Similarly, if we introduce a new basis $\vec{q}\,'_1, \vec{q}\,'_2$ in the space (S) and if
the transformation matrix from the new coordinates to the old
ones is K then $\mathbf{y} = K\mathbf{y}'$. Substituting these formulas into (24)
we obtain $K\mathbf{y}' = AH\mathbf{x}'$. Multiplying the last relation on the left
by $K^{-1}$ we deduce

$$\mathbf{y}' = K^{-1}AH\mathbf{x}', \quad \text{i.e.} \quad \mathbf{y}' = A'\mathbf{x}' \quad \text{where} \quad A' = K^{-1}AH$$

The matrix $A' = K^{-1}AH$ is nothing but the matrix of the mapping
in question relative to the new bases.

In particular, if (S) = (R) then K = H and therefore

$$A' = H^{-1}AH \tag{31}$$


Let us consider the geometric meaning of the determinant of the 
matrix of an affine mapping of a plane onto itself. Let the mapping 
be defined by formulas (25) and let us first suppose that the basis 

$\vec{p}_1, \vec{p}_2$ taken in the plane (P) is a Cartesian basis. According to our
condition we have $(\bar{P}) = (P)$ here. The coordinates of vectors being
transformed according to formulas (27), the vectors $\vec{p}_1, \vec{p}_2$ will be
carried into the vectors $\vec{s}_1 = a_{11}\vec{p}_1 + a_{21}\vec{p}_2$, $\vec{s}_2 = a_{12}\vec{p}_1 + a_{22}\vec{p}_2$
respectively (why?). The area of the square constructed on the vectors
$\vec{p}_1, \vec{p}_2$ equals unity whereas the area of the parallelogram which
is the image of the square under the mapping, that is of the
parallelogram constructed on the vectors $\vec{s}_1, \vec{s}_2$, is equal to the
absolute value of the determinant
$\begin{vmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{vmatrix}$, that is to
$|\det A|$, according to the end of Sec. VII.13. Now remark that
all the parts of a plane are changed in a similar fashion under an 
affine mapping and therefore the areas of all geometric figures change 
proportionally with the same factor of proportionality. Hence, 

| det A | is equal to the factor of proportionality defining the change 
of the areas under the mapping in question. The sign of det A also
has a certain geometric meaning. Namely, if the determinant is posi- 
tive the direction of describing the contour of a figure is retained 
under the mapping, that is if we describe the contour of a preimage 
in the positive direction the contour of the image is also described 
in the positive direction. If det A < 0 then the direction of des- 
cribing a contour is replaced by the opposite direction under the 
mapping. (Let the reader verify all the assertions for examples in 
Fig. 223.) 

If now we pass to an arbitrary new basis we shall have 

$$\det A' = \det(H^{-1}AH) = \det(H^{-1})\,\det A\,\det H = (\det H)^{-1}\,\det A\,\det H = \det A$$

and thus we see that although the matrix of an affine mapping depends 
on the choice of the basis with respect to which it is considered the 
determinant of the matrix is independent of the choice, that is its 
geometric meaning is the same for all possible choices of the basis. 
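Both facts, the invariance of the determinant under formula (31) and its meaning as an area factor, can be illustrated by a small exact computation. The sketch below is only an illustration under assumed data: the matrices A and H are arbitrary examples, and the helper names (`matmul`, `det2`, `inverse2`, `polygon_area`) are introduced here and do not come from the text.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det2(m):
    """Determinant of a matrix of the second order."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse2(m):
    """Exact inverse of a non-degenerate matrix of the second order."""
    d = det2(m)
    if d == 0:
        raise ValueError("a degenerate matrix has no inverse")
    return [[Fraction(m[1][1], d), Fraction(-m[0][1], d)],
            [Fraction(-m[1][0], d), Fraction(m[0][0], d)]]


def polygon_area(points):
    """Signed area of a polygon (positive when described counterclockwise)."""
    n = len(points)
    s = sum(points[i][0] * points[(i + 1) % n][1]
            - points[(i + 1) % n][0] * points[i][1] for i in range(n))
    return Fraction(s, 2)


A = [[2, 1], [0, 3]]   # illustrative matrix of an affine mapping
H = [[1, 1], [1, 2]]   # illustrative non-degenerate transformation matrix

A_new = matmul(inverse2(H), matmul(A, H))   # formula (31)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [(A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)
         for x, y in square]

print(det2(A), det2(A_new), polygon_area(image))
```

For these numbers the matrix changes under the change of basis while its determinant does not, and the image of the unit square has an area equal to that determinant.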

If we consider an affine mapping of one plane onto another then 
| det A | is also equal to the coefficient of proportionality defining
the change of the areas if we measure the areas on each of the planes 
relative to the areas of the corresponding parallelograms constructed 
on the basis vectors. 

In the case of an affine mapping of the geometric space onto itself 
we can analogously show that | det A | equals the proportionality 
factor defining the change of the volumes. In this case det A has the 
sign + or — depending on whether the right-handed triads of vectors 
remain right-handed or turn into the left-handed triads under the 
mapping in question. The geometric meaning of the determinant 
of the matrix of an affine mapping of a Cartesian space of any di- 
mension onto itself can be interpreted in a similar way. In particu- 
lar, this meaning immediately implies formula (5) because when 
we perform two affine mappings in succession the corresponding 
factors of proportionality defining the changes of the volumes are
mutually multiplied. 

8. The Matrix of a Mapping Relative to the Basis Consisting of 
Its Eigenvectors. Let us consider a linear mapping A of a linear space 

(R) into itself. If a nonzero vector $\vec{x}$ goes into a parallel vector under
the mapping, that is if the mapping of the vector $\vec{x}$ reduces to the
multiplication by a scalar ($A\vec{x} = \lambda\vec{x}$) then $\vec{x}$ is called an eigenvector
of the mapping A corresponding to the eigenvalue $\lambda$.

For instance, any vector parallel to the $x_1$-axis is an eigenvector
corresponding to the eigenvalue $k$ of the mapping shown in the
first example in Fig. 223. In this example any vector parallel to
the $x_2$-axis is also an eigenvector; it corresponds to the eigenvalue 1,
that is it does not change under the mapping. Let the reader find 
the eigenvectors and the eigenvalues for the other examples in 
Fig. 223. 

If we have chosen a basis in (R) we can consider the number vector
$\mathbf{x}$ consisting of the coordinates of a vector $\vec{x}$ instead of the vector $\vec{x}$
itself. Then, according to Sec. 6, the equality defining an eigenvector
acquires the form $A\mathbf{x} = \lambda\mathbf{x}$, i.e. form (11). Hence, by Sec. 4,
the vector x must be an eigenvector of the matrix A of the mapping 
in question relative to the chosen basis. In Sec. 4 we established 
the method of finding these vectors. On the basis of Sec. 4, we con- 
clude that the number of the eigenvalues of the mapping A coin- 
cides with the dimension of the space (R) but there can be imaginary 
values and coinciding values among them. For instance, in the 
example (c) in Fig. 223 all the eigenvalues are imaginary (check 
it up!) and therefore none of the nonzero vectors remains parallel 
to its original direction under the mapping. (By the way, Sec. VIII.8 
implies that if the dimension of (R) is odd then equation (13) posses- 
ses at least one real root and there is therefore at least one eigen- 
vector.) 

For definiteness, let us suppose that the space (R) is three-dimensional.
Besides, let us suppose that there are three linearly independent
(real) eigenvectors of the mapping A (let these vectors be
$\vec{l}_1, \vec{l}_2, \vec{l}_3$ and the eigenvalues be $\lambda_1, \lambda_2, \lambda_3$). Let us take these vectors
as a basis in (R). It turns out that in such a case the matrix of the
mapping acquires a form which is especially simple. Indeed, let us write

$$\begin{aligned} y'_1 &= a'_{11}x'_1 + a'_{12}x'_2 + a'_{13}x'_3 \\ y'_2 &= a'_{21}x'_1 + a'_{22}x'_2 + a'_{23}x'_3 \\ y'_3 &= a'_{31}x'_1 + a'_{32}x'_2 + a'_{33}x'_3 \end{aligned} \tag{32}$$

where the numbers $a'_{jk}$ ($j, k = 1, 2, 3$) are the elements of the matrix
relative to the basis which are yet unknown and $x'_1, x'_2, x'_3$ are the
coordinates in this basis. The vector $\vec{l}_1$ has the coordinates
(projections) $x'_1 = 1$, $x'_2 = 0$, $x'_3 = 0$. After the mapping it goes into the
vector with the coordinates $y'_1 = \lambda_1$, $y'_2 = 0$, $y'_3 = 0$. We must
therefore have

$$\begin{aligned} \lambda_1 &= a'_{11}\cdot 1 + a'_{12}\cdot 0 + a'_{13}\cdot 0, \\ 0 &= a'_{21}\cdot 1 + a'_{22}\cdot 0 + a'_{23}\cdot 0, \\ 0 &= a'_{31}\cdot 1 + a'_{32}\cdot 0 + a'_{33}\cdot 0 \end{aligned}$$

from which we find $a'_{11} = \lambda_1$, $a'_{21} = 0$ and $a'_{31} = 0$. We similarly
obtain $a'_{22} = \lambda_2$, $a'_{33} = \lambda_3$, $a'_{12} = a'_{13} = a'_{23} = a'_{32} = 0$ (check it
up!). Therefore formulas (32) in fact have the form

$$y'_1 = \lambda_1 x'_1, \quad y'_2 = \lambda_2 x'_2, \quad y'_3 = \lambda_3 x'_3$$

Consequently, the matrix of a linear mapping relative to the basis
consisting of its eigenvectors has the diagonal form,

$$A' = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_3)$$

By Sec. 4, for such a basis to exist, it is sufficient that all the 
roots of the characteristic equation of the matrix A be real and 
distinct. Any matrix can be regarded as the matrix of a linear map- 
ping, and the matrix of a mapping is transformed according to for- 
mula (31) when the basis is changed. Hence, the above result can 
also be formulated as follows: if all the roots of the characteristic 
equation of a square matrix A are real and distinct it is possible to
find a non-degenerate matrix H such that the matrix $H^{-1}AH$ will be
diagonal and the diagonal elements will be equal to the roots.
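The reduction to the diagonal form can be traced on a matrix of the second order. The sketch below is an illustration under assumptions: the matrix A is an arbitrary example with real distinct integer eigenvalues, the eigenvector formula $(a_{12}, \lambda - a_{11})$ is used because $a_{12} \neq 0$ here, and all helper names are introduced only for this example.

```python
from fractions import Fraction
from math import isqrt


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse2(m):
    """Exact inverse of a non-degenerate matrix of the second order."""
    d = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if d == 0:
        raise ValueError("a degenerate matrix has no inverse")
    return [[Fraction(m[1][1], d), Fraction(-m[0][1], d)],
            [Fraction(-m[1][0], d), Fraction(m[0][0], d)]]


A = [[4, 1], [2, 3]]   # illustrative matrix (assumed, not from the text)

# Roots of the characteristic equation: lambda^2 - trace*lambda + det = 0.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = trace * trace - 4 * det
root = isqrt(disc)
if root * root != disc:
    raise ValueError("this sketch expects a perfect-square discriminant")
lam1 = Fraction(trace + root, 2)
lam2 = Fraction(trace - root, 2)

# Columns of H are eigenvectors (a12, lambda - a11); valid because a12 != 0.
H = [[A[0][1], A[0][1]],
     [lam1 - A[0][0], lam2 - A[0][0]]]

A_new = matmul(inverse2(H), matmul(A, H))   # formula (31)
print(lam1, lam2, A_new)
```

The resulting matrix is diagonal and its diagonal elements are the two roots, taken in the same order as the columns of H.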

If the characteristic equation of a matrix has an imaginary root 
we can find the corresponding eigenvector (number vector) by sol- 
ving equations (12), and the coordinates of the vector will also be 
imaginary. Such an eigenvector has no geometric meaning. For 
instance, this is the case for the third example in Fig. 223. But 
of course we can use such number vectors without considering their 
geometric meaning. If we admit complex values of the projections 
in question then all the calculations remain true. In particular, the 
assertion of the preceding paragraph will remain true for any square
matrix all of whose eigenvalues (i.e. the roots of the characteristic
equation) are distinct. But of course in the general case the matrix
H may be complex. 

If the characteristic equation of a matrix has multiple roots such 
a matrix cannot be reduced to the diagonal form in the general 
case. In particular, this is the case for the fourth example in Fig. 223. 

In conclusion, let us note that although a matrix A is transformed 
according to formula (31) when the basis is changed its characteristic 
equation does not change. Indeed, we have

$$\begin{aligned} \det(A' - \lambda I) &= \det(H^{-1}AH - \lambda I) = \det[H^{-1}(A - \lambda I)H] \\ &= \det(H^{-1})\,\det(A - \lambda I)\,\det H = (\det H)^{-1}\,\det(A - \lambda I)\,\det H = \det(A - \lambda I) \end{aligned} \tag{33}$$

which is what we set out to prove.

9. Transforming Cartesian Basis. In this section we shall suppose 
that ( R ) is not only a linear space but also a Euclidean space (see 
Secs. VII. 20-21). Thus we can speak about Cartesian (Euclidean) 
bases in ( R ). Let us investigate the properties of a matrix which 
defines the transformation from a Cartesian basis to another Cartesian 
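basis. For definiteness, let us consider the space (R) to be
three-dimensional. In order to investigate the transformation we shall
use formulas (29). Let $\vec{p}_1, \vec{p}_2, \vec{p}_3$ form a Cartesian basis, that is let
them play the same role as vectors $\vec{i}, \vec{j}, \vec{k}$ in § VII.3. For $\vec{p}\,'_1, \vec{p}\,'_2, \vec{p}\,'_3$
also to form a Cartesian basis it is necessary and sufficient that there
should be

$$\vec{p}\,'_1\cdot\vec{p}\,'_1 = \vec{p}\,'_2\cdot\vec{p}\,'_2 = \vec{p}\,'_3\cdot\vec{p}\,'_3 = 1 \quad \text{and} \quad \vec{p}\,'_1\cdot\vec{p}\,'_2 = \vec{p}\,'_2\cdot\vec{p}\,'_3 = \vec{p}\,'_1\cdot\vec{p}\,'_3 = 0$$

(why?). Putting down the scalar products in full according to
formula (VII.12) we obtain

$$\begin{aligned} & h_{11}^2 + h_{21}^2 + h_{31}^2 = h_{12}^2 + h_{22}^2 + h_{32}^2 = h_{13}^2 + h_{23}^2 + h_{33}^2 = 1, \\ & h_{11}h_{12} + h_{21}h_{22} + h_{31}h_{32} = h_{12}h_{13} + h_{22}h_{23} + h_{32}h_{33} = h_{11}h_{13} + h_{21}h_{23} + h_{31}h_{33} = 0 \end{aligned}$$

These six equalities can be put down in the matrix form

$$\begin{pmatrix} h_{11} & h_{21} & h_{31} \\ h_{12} & h_{22} & h_{32} \\ h_{13} & h_{23} & h_{33} \end{pmatrix} \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$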


(Let the reader check up the last relation by multiplying the matrices 
entering into the left-hand side according to the rules of Sec. 2 
and by comparing the result with the right-hand side.) Using the 
notation introduced in Secs. 1 and 3 we can rewrite the above con- 
dition in the form 

$$H^*H = I, \quad \text{i.e.} \quad H^* = H^{-1} \tag{34}$$


A matrix satisfying these equalities, that is a matrix whose inverse is equal
to its transpose, is called orthogonal. Hence, the property proved
above can be stated as follows: the matrix of a transformation from 
one Cartesian coordinate system to another Cartesian system is 
orthogonal. We can similarly show that, conversely, if the
transformation matrix is orthogonal then any Cartesian basis is
necessarily transformed into a Cartesian basis. [Verify that the matrices
of transformations (II.3) and (II.4) are orthogonal.]

Orthogonal matrices are also encountered in connection with 
mappings. Namely, a linear mapping of a Euclidean space into 
itself is called orthogonal if the lengths of the vectors are not changed 
under the mapping (such mappings are also called isometric). Any 
triangle being carried into an equal triangle under such a mapping 
(according to the well-known test for the equality of triangles), we 
see that all the angles are retained in this case. An orthogonal map- 
ping can either be a motion of the space as a whole or a combination 
of a motion and a reflection (in a hyperplane). 

For instance, the third and the fifth mappings among those shown 
in Fig. 223 are orthogonal (the former defines a motion and the 
latter a reflection). 

By analogy with formulas (22) and by arguments similar to those 
at the beginning of this section we can easily verify that the matrix 
A of an orthogonal mapping relative to a Cartesian basis is an
orthogonal matrix. Conversely, if a mapping has an orthogonal matrix
relative to a Cartesian basis the mapping is orthogonal. 

Equality A*A = I [see formula (34)] implies that 

$$\det A^* \cdot \det A = (\det A)^2 = \det I = 1$$

from which it follows that det A = ±1. This is also implied by 
the geometric meaning of the determinant of the matrix of a linear 
mapping (see Sec. 7). The determinant equals 1 for a motion and 
— 1 for a reflection or for a combination of a reflection and a motion. 
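The conditions $A^*A = I$ and $\det A = \pm 1$ are easy to test numerically. The following Python sketch is an illustration only: the angle of 30° and the shear coefficient 0.5 are arbitrary choices, a small tolerance is assumed because the rotation is computed in floating point, and the helper names are introduced for this example.

```python
import math


def transpose(m):
    """Transposed matrix (rows become columns)."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def det2(m):
    """Determinant of a matrix of the second order."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def is_orthogonal(m, tol=1e-12):
    """Check the condition (transpose of m) * m = I up to the tolerance tol."""
    product = matmul(transpose(m), m)
    n = len(m)
    return all(abs(product[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))


alpha = math.radians(30)
rotation = [[math.cos(alpha), -math.sin(alpha)],
            [math.sin(alpha), math.cos(alpha)]]
reflection = [[-1, 0], [0, 1]]
shear = [[1, 0.5], [0, 1]]

for name, m in [("rotation", rotation), ("reflection", reflection), ("shear", shear)]:
    print(name, is_orthogonal(m), det2(m))
```

The rotation and the reflection pass the test with determinants close to 1 and equal to −1 respectively, whereas the shear has determinant 1 but is not orthogonal, so the condition on the determinant alone is not sufficient.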

Let us note a consequence which is used in mechanics. Let there
be given a motion of the geometric space for which the origin of
a coordinate system remains at the same place. Such a motion can
be regarded as an orthogonal mapping A of the totality of all the
vectors of a three-dimensional space. If we factor the left-hand side
of the characteristic equation according to formula (VIII.25) in the
form

$$\det(A - \lambda I) = -(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3)$$

and then put $\lambda = 0$ in this identity we obtain $\lambda_1\lambda_2\lambda_3 = \det A = 1$. It follows
that at least one of the eigenvalues $\lambda_k$ is real and positive. Hence,
there exists a vector $\vec{x}_0 \neq 0$ for which $A\vec{x}_0 = \lambda_k\vec{x}_0$. But the lengths
being retained, we must have $\lambda_k = 1$. Consequently, the motion
in question is a rotation about an axis passing through the origin
of coordinates and parallel to an eigenvector corresponding to the
eigenvalue 1.

10. Symmetric Matrices. Here we shall indicate the application 
of the above results to investigating symmetric matrices (see Sec. 1). 

It is possible to prove the following properties of the symmetric 
matrices. 


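All the eigenvalues of a symmetric matrix are real.

For example, a symmetric matrix of the second order is of the
form

$$\begin{pmatrix} a & b \\ b & c \end{pmatrix} \tag{35}$$

and its characteristic equation is

$$\begin{vmatrix} a - \lambda & b \\ b & c - \lambda \end{vmatrix} = 0$$

or

$$\lambda^2 - (a + c)\lambda + ac - b^2 = 0 \tag{36}$$

The roots of the equation are

$$\lambda_{1,2} = \frac{a + c}{2} \pm \sqrt{\left(\frac{a - c}{2}\right)^2 + b^2} \tag{37}$$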

and they are obviously real. Here we shall not give the proof of 
this assertion and of the two following assertions in the general case 
for matrices of order higher than the second. 

Eigenvectors of a symmetric matrix corresponding to different eigen- 
values are necessarily orthogonal to each other. 

As an example, let us take matrix (35) for $b \neq 0$. The coordinates
of an eigenvector are found from system (12) which has the form

$$\begin{aligned} (a - \lambda)x_1 + bx_2 &= 0 \\ bx_1 + (c - \lambda)x_2 &= 0 \end{aligned}$$

in our case. If $\lambda$ is an eigenvalue these two equations are dependent
(see Sec. VI.6) because the determinant

$$\begin{vmatrix} a - \lambda & b \\ b & c - \lambda \end{vmatrix}$$

equals zero. Hence, we can limit ourselves to solving only one of
the equations. For definiteness, let us take the first equation. In
order to satisfy the equation we can put $x_1 = -b$ and $x_2 = a - \lambda$.
Putting $\lambda = \lambda_1$ and $\lambda = \lambda_2$ we thus obtain two eigenvectors of the
form

$$\begin{pmatrix} -b \\ a - \lambda_1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -b \\ a - \lambda_2 \end{pmatrix} \tag{38}$$

The scalar product of the vectors is equal to

$$\begin{aligned} b^2 + (a - \lambda_1)(a - \lambda_2) &= \lambda_1\lambda_2 - a(\lambda_1 + \lambda_2) + b^2 + a^2 \\ &= ac - b^2 - a(a + c) + b^2 + a^2 = 0 \end{aligned}$$

[in deducing the result we have taken advantage of the well-known
formulas for the sum and for the product of the roots of
quadratic equation (36)]. This implies the perpendicularity of
vectors (38). If we put $b = 0$ the vectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ can serve
as eigenvectors (verify this!) and thus we see that in this case they
are also mutually perpendicular.

If not all the roots of the characteristic equation are distinct, 
that is if there is at least one multiple root (let the multiplicity of 
the root be k), it is possible to find k mutually orthogonal eigenvec- 
tors corresponding to this eigenvalue. 

For example, formula (37) implies that matrix (35) possesses a 
double eigenvalue if and only if $a = c$ and $b = 0$. But then all
the vectors are eigenvectors (check it up!) and thus we can choose 
two mutually perpendicular vectors among them. 

The above properties imply that if a given symmetric matrix A 
is regarded as the matrix of a linear mapping relative to a Cartesian 
basis then we can always find a new Cartesian basis which entirely 
consists of eigenvectors of the matrix A. For instance, in the three- 
dimensional case the characteristic equation is of degree three 
and thus it has three roots which, as it has been indicated, 
will be real. If these roots are distinct from each other the
corresponding eigenvectors are mutually perpendicular. We can choose
these vectors so that they should be of unit length, and then they
can be taken as the sought-for basis. If $\lambda_1 = \lambda_2 \neq \lambda_3$ we can choose
two mutually perpendicular eigenvectors corresponding to the
eigenvalue $\lambda_1$, and the vector corresponding to the eigenvalue $\lambda_3$
will be perpendicular to both vectors. Finally, if all the three
eigenvalues are the same we can indicate three mutually perpendicular
eigenvectors corresponding to the eigenvalue.

The transformation from one Cartesian basis to another is per- 
formed by means of an orthogonal matrix (see Sec. 9). Besides, if 
the latter basis consists of eigenvectors of a matrix it takes the 
diagonal form after being transformed according to formula (31) 
(see Sec. 8). Consequently, the property proved in the preceding 
paragraph can be formulated in terms of matrices as follows: for 
any symmetric matrix A, it is possible to find an orthogonal matrix 
H such that the matrix $H^{-1}AH$ is diagonal and the diagonal
elements are equal to the eigenvalues of the matrix A.
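The properties of a symmetric matrix of the second order can be checked on concrete numbers. In the Python sketch below the values a = 2, b = 2, c = 5 are an arbitrary assumption; the eigenvalues are computed as the roots of the quadratic characteristic equation of the matrix with rows (a, b) and (b, c), and the eigenvectors are taken in the form (−b, a − λ), which satisfies the first equation of the eigenvector system when b ≠ 0.

```python
import math

# Illustrative symmetric matrix with rows (a, b) and (b, c); values assumed.
a, b, c = 2.0, 2.0, 5.0

# Roots of lambda^2 - (a + c)*lambda + a*c - b^2 = 0.
mean = (a + c) / 2
radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)   # never imaginary
lam1, lam2 = mean + radius, mean - radius

# Eigenvectors of the form (-b, a - lambda), usable because b != 0.
v1 = (-b, a - lam1)
v2 = (-b, a - lam2)


def apply_sym(v):
    """Image of the vector v under the mapping with the symmetric matrix."""
    return (a * v[0] + b * v[1], b * v[0] + c * v[1])


dot = v1[0] * v2[0] + v1[1] * v2[1]
print(lam1, lam2, v1, v2, dot)
```

For these values both eigenvalues are real, each vector is carried into a multiple of itself, and the scalar product of the two eigenvectors vanishes. Dividing each eigenvector by its length and taking the results as the columns of H would give an orthogonal matrix of the kind described above.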

§ 3. Quadratic Forms 

11. Quadratic Forms. A quadratic form in several variables is 
a homogeneous polynomial of the second degree in these variables. 
For example, a quadratic form in the three variables $x_1, x_2, x_3$ has
the general form

$$F = a_{11}x_1^2 + a_{22}x_2^2 + a_{33}x_3^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3 + 2a_{23}x_2x_3 \tag{39}$$

where $a_{11}, a_{22}, \ldots, a_{23}$ are numerical coefficients [some of the
coefficients are doubled in (39) to simplify further formulas]. The
symmetric matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}$$

is called the matrix of the quadratic form. With the help of the
matrix we can rewrite formula (39) as



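$$\begin{aligned} F &= (a_{11}x_1 + a_{12}x_2 + a_{13}x_3)x_1 + (a_{12}x_1 + a_{22}x_2 + a_{23}x_3)x_2 + (a_{13}x_1 + a_{23}x_2 + a_{33}x_3)x_3 \\ &= y_1x_1 + y_2x_2 + y_3x_3 = \begin{pmatrix} x_1 & x_2 & x_3 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \end{aligned}$$

Here

$$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ a_{12}x_1 + a_{22}x_2 + a_{23}x_3 \\ a_{13}x_1 + a_{23}x_2 + a_{33}x_3 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = A \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$

Thus, if we introduce the number vector $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ we obtain

$$F = \mathbf{x}^*A\mathbf{x} \tag{40}$$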


Conversely, if a form is represented as (40) and if the matrix A 
is symmetric then A is the matrix of this quadratic form. 

Let us now perform an arbitrary linear transformation of the 
variables of form (39). The transformation can be put down in the
matrix form as 

x = Hx' (41) 

Then, by formula (4), we have x* = x'*H* which implies 
F = x'*H*AHx' = x'* (H*AH) x' 

that is 

F = x'*A'x' where A' = H*AH (42) 

But the matrix A' is symmetric because, by formula (4), we have 
A'* = (H*AH)* = H*A*H** = H*AH = A' 

Hence, it is A' that is the matrix of the quadratic form after the 
change of variables is made. 

Thus, substitution (41) yields the transformation of the matrix 
of a quadratic form according to formula (42). In particular, if H 
is an orthogonal matrix then, by formula (34), we see that
$A' = H^{-1}AH$. As it has been shown (see Sec. 10), we can always choose
a matrix H such that there should be $A' = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_3)$ where
the diagonal elements are the eigenvalues of the matrix A. But then
the quadratic form will acquire the diagonal form

$$F = \lambda_1 x_1'^2 + \lambda_2 x_2'^2 + \lambda_3 x_3'^2 \tag{43}$$

in the new variables. Consequently, any quadratic form (39) can be
reduced to diagonal form (43) (where $\lambda_1, \lambda_2, \lambda_3$ are the eigenvalues
of the matrix A) by means of transformation (30) with an orthogonal
matrix H.

The above formal transformation has the following geometric 
meaning. Let us regard A as the matrix of a linear mapping A rela- 
tive to a Cartesian basis with the coordinates x u x 2 , x 3 . Then trans- 
formation (41) reducing the quadratic form F to form (43) corres- 
ponds to the transformation to a new basis consisting of eigenvectors 
of the mapping A. 

In Sec. 8 [see formula (33)] we showed that any transformation 
of form (31) does not change the determinant $\det(A - \lambda I)$. Hence,
if we expand the determinant in powers of $\lambda$ the coefficients in the
powers will not change; they will be invariant with respect to any 
transformation of a Cartesian coordinate system to another one. 
For instance, a quadratic form in two variables is expressed by the 
formula 


$$Ax^2 + 2Bxy + Cy^2$$

(we have put down the formula using the notation applied in analytic
geometry), that is its matrix is of the form

$$\begin{pmatrix} A & B \\ B & C \end{pmatrix}$$

and its characteristic equation is written as

$$\begin{vmatrix} A - \lambda & B \\ B & C - \lambda \end{vmatrix} = \lambda^2 - (A + C)\lambda + AC - B^2 = 0$$

Hence, the expressions $A + C$ and $AC - B^2$ are invariant with
respect to any change of Cartesian coordinates (see Sec. II.13).
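The invariance of the sum of the extreme coefficients and of the determinant of the form can be observed directly. The Python sketch below assumes a rotation of the coordinate axes written as $x = x'\cos\alpha - y'\sin\alpha$, $y = x'\sin\alpha + y'\cos\alpha$ and substitutes it into the form; the coefficients 3, 1, 2 and the angle 0.7 rad are arbitrary, and the function name `rotate_form` is introduced only for this illustration.

```python
import math


def rotate_form(A, B, C, alpha):
    """Coefficients of A*x^2 + 2*B*x*y + C*y^2 in the rotated coordinates.

    The substitution is x = x'*cos(alpha) - y'*sin(alpha),
                        y = x'*sin(alpha) + y'*cos(alpha).
    """
    c, s = math.cos(alpha), math.sin(alpha)
    A_new = A * c * c + 2 * B * c * s + C * s * s
    B_new = (C - A) * c * s + B * (c * c - s * s)
    C_new = A * s * s - 2 * B * c * s + C * c * c
    return A_new, B_new, C_new


A, B, C = 3.0, 1.0, 2.0          # illustrative coefficients (assumed)
A1, B1, C1 = rotate_form(A, B, C, 0.7)

sum_before, sum_after = A + C, A1 + C1
det_before, det_after = A * C - B * B, A1 * C1 - B1 * B1
print(sum_before, sum_after, det_before, det_after)
```

The individual coefficients change under the rotation, while the two combinations agree up to rounding errors of floating-point arithmetic.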

12. Simplification of Equations of Second-Order Curves and
Surfaces. The transformation of a quadratic form described in 
Sec. 11 is applied, in particular, to simplifying equations of curves 
and surfaces of the second order. Let us dwell on equations of sur- 
faces since the problem of simplifying equations of curves of the 
second order was considered in Sec. II.13.

Let the equation of a second-order surface be represented in ordi- 
nary form (X. 13) used in analytic geometry. The transformation 
to a new Cartesian coordinate system having the same origin is 
reduced to a change of variables of the form

$$\begin{aligned} x &= h_{11}x' + h_{12}y' + h_{13}z' \\ y &= h_{21}x' + h_{22}y' + h_{23}z' \\ z &= h_{31}x' + h_{32}y' + h_{33}z' \end{aligned} \tag{44}$$

as it was shown in Sec. 9, where $H = (h_{ij})$ is an orthogonal
transformation matrix. (It is evident that if the origin of coordinates is
left unchanged the coordinates of points and the coordinates of
vectors are transformed according to the same formulas.) Substituting
these expressions into equation (X.13) we see that the groups
of summands containing the terms of the first degree and of the
second degree are transformed independently. Let us consider the
transformation of the group of the second-order terms which is
a quadratic form. On the basis of Sec. 9, we conclude that we can
always choose a coordinate system $x', y', z'$ so that this group of
terms should acquire the diagonal form

$$\lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 z'^2$$

Hence, the whole equation (X.13) will have the form

$$\lambda_1 x'^2 + \lambda_2 y'^2 + \lambda_3 z'^2 + G'x' + H'y' + I'z' + J = 0 \tag{45}$$

where $\lambda_1, \lambda_2, \lambda_3$ are the roots of the equation

$$\begin{vmatrix} A - \lambda & B & D \\ B & C - \lambda & E \\ D & E & F - \lambda \end{vmatrix} = 0$$

and $G', H', I'$ are some new coefficients in the terms of the first
degree which occur after substitution (44) has been made. Equation
(45) is nothing but equation (X.14) put down in the different notation.
It was investigated in Sec. X.11.


§ 4. Non-Linear Mappings 

13. General Notions. Let us begin with a mapping of a plane 
into a plane. Suppose that there are two planes $(P)$ and $(\bar{P})$ (by the
way, the planes may coincide). Let to each point $M$ of the plane
$(P)$ (or to each point $M$ taken from a domain in the plane) there
correspond a point $\bar{M}$ of the plane $(\bar{P})$, according to a certain law.
Then we say that we are given a mapping of the plane $(P)$ (or of
its domain) into the plane $(\bar{P})$. For the mappings of a specific class,
curves go into curves and geometric figures into geometric figures
under a mapping of the plane $(P)$ into the plane $(\bar{P})$ although
the form of a geometric figure may change considerably (see Sec.
VIII.11). Such a mapping is depicted in Fig. 224, and we see that
knowing a preimage in the plane $(P)$ it is difficult to recognize its
image in the plane $(\bar{P})$ and vice versa. There are also the cases of
degeneration when some geometric figures are “contracted” into curves
or even into points.

It is sometimes necessary to consider the inverse mapping which
can be obtained if we arbitrarily choose images in $(\bar{P})$ and find the
corresponding preimages in $(P)$. As in Sec. I.21, where we considered
inverse functions, it can happen that we encounter a difficulty,
namely the fact that the inverse of a single-valued mapping may
not be single-valued. This will be the case if two distinct points in
the plane $(P)$ (for instance, such points as the points $U$ and $V$ in
Fig. 224) are carried into the same point of the plane $(\bar{P})$. Such a
point will have at least two preimages.

If not only the mapping in question but also its inverse mapping
is single-valued we say that there is a one-to-one mapping. If
the mapping under consideration is not one-to-one but does
not degenerate the plane $(P)$ can be broken into parts such
that the mapping is one-to-one in each of the parts.

Mappings can be described analytically by means of coordinate
systems. To do this suppose that there is a Cartesian coordinate
system $x, y$ in the plane $(P)$ and a system $\bar{x}, \bar{y}$ in the plane $(\bar{P})$.
These systems may also be coincident. Then if we set the coordinates
$x, y$ of a point $M$ the coordinates $\bar{x}, \bar{y}$ of the corresponding point $\bar{M}$
will be completely specified. In other words, the mapping is defined
by some relationships of the form

$$\bar{x} = \bar{x}(x, y), \quad \bar{y} = \bar{y}(x, y) \tag{46}$$

When considering the inverse mapping we set the values of $\bar{x}$ and
$\bar{y}$ in these formulas and find $x$ and $y$. For the mapping to be
one-to-one it is necessary that there should be no more than one solution
$x, y$ of equations (46) for any given $\bar{x}$ and $\bar{y}$.
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Similarly, a mapping of a three-dimensional space into another 
three-dimensional space is defined by equations of the form 

x — x(x, y, z), y = y (x, y, z), z = z ( x , y, z) (47) 
in place of (46). 

We can also consider mappings for spaces of different dimensions. 
For instance, the formulas 

$$\bar{x} = \bar{x}(x, y), \qquad \bar{y} = \bar{y}(x, y), \qquad \bar{z} = \bar{z}(x, y)$$

define a mapping of a plane into a three-dimensional space. 

Formulas (X.5) can also be regarded as formulas defining a mapping
of an m-dimensional space with the coordinates $t_1, t_2, \ldots, t_m$
into an n-dimensional space with the coordinates $x_1, x_2, \ldots, x_n$.
Of course, the coordinate systems may not be Cartesian in the general
case.

14. Non-Linear Mapping in the Small. Let us consider a mapping
defined by formulas (46) in the vicinity of a point $M_0(x_0, y_0)$ which
is mapped into a point $\bar{M}_0(\bar{x}_0, \bar{y}_0)$. The increment of any function
being close to its differential to within the terms of higher order
of smallness (see Sec. IX.11), we can neglect these terms and put down

$$\Delta\bar{x} = \left(\frac{\partial \bar{x}}{\partial x}\right)_0 \Delta x + \left(\frac{\partial \bar{x}}{\partial y}\right)_0 \Delta y, \qquad \Delta\bar{y} = \left(\frac{\partial \bar{y}}{\partial x}\right)_0 \Delta x + \left(\frac{\partial \bar{y}}{\partial y}\right)_0 \Delta y \tag{48}$$

Here $\Delta x = x - x_0$, $\Delta\bar{x} = \bar{x} - \bar{x}_0$ etc. (we can say that these are
Cartesian coordinates reckoned from $M_0$ and $\bar{M}_0$, respectively) and
the index “zero” indicates that the derivatives are taken at the point
$M_0$. Comparing these formulas with formulas (27) we conclude that
a non-linear mapping can be regarded as a linear mapping in an
infinitesimal neighbourhood of any point with an accuracy of
infinitesimals of higher order of smallness.

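On the basis of Sec. 6, we conclude that if the determinant

$$\begin{vmatrix} \dfrac{\partial \bar{x}}{\partial x} & \dfrac{\partial \bar{x}}{\partial y} \\[2mm] \dfrac{\partial \bar{y}}{\partial x} & \dfrac{\partial \bar{y}}{\partial y} \end{vmatrix} = \frac{D(\bar{x}, \bar{y})}{D(x, y)} \tag{49}$$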


(see the notation in Sec. IX.13) is unequal to zero at a point $M_0$
the mapping in question is one-to-one in an infinitesimal neighbourhood
of the point. Moreover, it can even be regarded as affine with
an accuracy of infinitesimals of higher order. Besides, the absolute
value of the determinant is equal to the proportionality factor
defining the change of the areas of infinitesimal geometric figures
(placed infinitely close to the point) under the mapping. The
coefficient is no longer constant in the whole plane as it was in the case
of a linear mapping because the determinant takes on different
values at different points in the general case. In particular, the
meaning of the determinant makes it possible to attribute a certain
geometric meaning separately to the denominator and to the numerator
of the expression $\frac{D(\bar{x}, \bar{y})}{D(x, y)}$. Namely, we can consider them as
being equal to the areas of an infinitesimal figure before the mapping
and after the mapping, respectively.

If Jacobian (49) vanishes at a point then the mapping in question 
degenerates at the point, namely, the area of an infinitesimal geometric
figure becomes an infinitesimal of higher order of smallness after 
the mapping has been performed. Finally, if Jacobian (49) is iden- 
tically equal to zero the mapping degenerates throughout the whole 
plane which leads to a reduction of the dimension: the plane can 
be mapped into a line (not necessarily into a straight line) or even 
into a point. 

One must not think that in case Jacobian (49) does not turn into 
zero at all the points of a finite domain the mapping in question 
will be one-to-one in the domain. A mapping can be non-degenerate 
at all the points and nevertheless it may not be one-to-one (see 
Fig. 225). 
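
The last remark can be checked on a concrete mapping. The sketch below uses a mapping of our own choosing (it is not the one shown in Fig. 225): $\bar{x} = e^x \cos y$, $\bar{y} = e^x \sin y$. Its Jacobian equals $e^{2x}$ and never vanishes, and yet the points (x, y) and (x, y + 2π) are carried into the same image. The function names are illustrative only.

```python
import math

def mapping(x, y):
    """Illustrative mapping (our choice): x_bar = e^x cos y, y_bar = e^x sin y."""
    return math.exp(x) * math.cos(y), math.exp(x) * math.sin(y)

def jacobian(x, y):
    """Determinant D(x_bar, y_bar)/D(x, y) from the analytic partial derivatives."""
    ex = math.exp(x)
    dxb_dx, dxb_dy = ex * math.cos(y), -ex * math.sin(y)
    dyb_dx, dyb_dy = ex * math.sin(y), ex * math.cos(y)
    return dxb_dx * dyb_dy - dxb_dy * dyb_dx  # simplifies to e^(2x) > 0

first = (0.3, 1.0)
second = (0.3, 1.0 + 2.0 * math.pi)  # a distinct point of the plane (P)

print("Jacobians:", jacobian(*first), jacobian(*second))
print("Images:   ", mapping(*first), mapping(*second))
```

Both Jacobians are positive while the two images coincide up to rounding, so the mapping is locally one-to-one everywhere without being one-to-one in the whole plane.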

A mapping of a three-dimensional space into another three-dimensional
space possesses similar properties. Such a mapping can be
defined by formulas (47). Here the value of the Jacobian

$$\frac{D(\bar{x}, \bar{y}, \bar{z})}{D(x, y, z)}$$

is also essential. Its absolute value is equal to the factor of
proportionality defining the changes of the volumes of infinitely small
solids. (What is the geometric meaning of the sign of the Jacobian?)

If the Jacobian is identically equal to zero we can pose 
the problem of determining the “degree” of the degeneration, that 
is whether the space x, y, z will be mapped onto a surface or onto 
a curve (or even into a point) of the space $\bar{x}, \bar{y}, \bar{z}$. The answer to the
question is implied by the considerations given in Sec. 6 and by the 
fact that every mapping is linear in the small (to within infinitesimals 
of higher order). Thus, we must investigate the matrix 


$$\begin{pmatrix} \dfrac{\partial \bar{x}}{\partial x} & \dfrac{\partial \bar{x}}{\partial y} & \dfrac{\partial \bar{x}}{\partial z} \\[2mm] \dfrac{\partial \bar{y}}{\partial x} & \dfrac{\partial \bar{y}}{\partial y} & \dfrac{\partial \bar{y}}{\partial z} \\[2mm] \dfrac{\partial \bar{z}}{\partial x} & \dfrac{\partial \bar{z}}{\partial y} & \dfrac{\partial \bar{z}}{\partial z} \end{pmatrix} \tag{50}$$

In the case of degeneration its rank is less than three at any point
(why?). If it is equal to 2 everywhere (except for some points at
which the rank can be reduced still lower) then the x, y, z-space is
mapped onto a two-dimensional surface. If the rank does not exceed
unity at any point but does not vanish identically then the space
is mapped onto a one-dimensional curve. Finally, if it is identically
equal to zero [this means that all the elements of matrix (50) are
identically equal to zero] the whole space will be mapped into a point
because this can be only if $\bar{x}, \bar{y}, \bar{z}$ are identically constant.

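A similar situation occurs when we consider mappings of spaces
of arbitrary dimensions which may be different in the general case.
As it has already been said, formulas (X.5) can be regarded as
formulas defining a mapping of an m-dimensional space with the
coordinates $t_1, t_2, \ldots, t_m$ into an n-dimensional space with the
coordinates $x_1, x_2, \ldots, x_n$. To determine the dimension of the manifold
which appears as a result of the mapping we must compose an
(n × m) matrix of the form

$$\begin{pmatrix} \dfrac{\partial x_1}{\partial t_1} & \dfrac{\partial x_1}{\partial t_2} & \cdots & \dfrac{\partial x_1}{\partial t_m} \\[2mm] \dfrac{\partial x_2}{\partial t_1} & \dfrac{\partial x_2}{\partial t_2} & \cdots & \dfrac{\partial x_2}{\partial t_m} \\[2mm] \cdots & \cdots & \cdots & \cdots \\[2mm] \dfrac{\partial x_n}{\partial t_1} & \dfrac{\partial x_n}{\partial t_2} & \cdots & \dfrac{\partial x_n}{\partial t_m} \end{pmatrix}$$

If the rank of the matrix equals k (we admit a further reduction
of the rank at some points) then the dimension is equal to k.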
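
The rank criterion lends itself to a numerical experiment. The following is only a sketch under our own assumptions: the three sample mappings, the finite-difference step and the tolerance are all illustrative choices, and a rank found numerically at one point is an indication rather than a proof.

```python
import math

def jacobian_matrix(f, point, h=1e-6):
    """n x m matrix of partial derivatives of f at point (central differences)."""
    columns = []
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        columns.append([(a - b) / (2 * h) for a, b in zip(f(*plus), f(*minus))])
    return [list(row) for row in zip(*columns)]  # rows = image coordinates

def rank(matrix, tol=1e-6):
    """Rank by Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        if r == len(rows):
            break
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][col]))
        if abs(rows[pivot][col]) < tol:
            continue  # no usable pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def onto_curve(x, y, z):        # everything depends on s = x + y + z only
    s = x + y + z
    return (s, s * s, math.sin(s))

def onto_surface(x, y, z):      # third coordinate is the product of the first two
    a, b = x + y, y * z
    return (a, b, a * b)

def identity(x, y, z):
    return (x, y, z)

point = (0.4, 0.7, -0.2)
for name, f in [("curve", onto_curve), ("surface", onto_surface), ("space", identity)]:
    print(name, rank(jacobian_matrix(f, point)))
```

The three ranks correspond to the three cases discussed above: a one-dimensional curve, a two-dimensional surface and a non-degenerate mapping.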

15. Functional Relation Between Functions. The results of Sec. 14 
can be applied to the notion of “functionally dependent” systems 
of functions. Let us first suppose that we are given three functions 
of three independent variables: 

$$F_1(x, y, z), \qquad F_2(x, y, z), \qquad F_3(x, y, z) \tag{51}$$

We say that the functions are dependent on each other if there is
a relation of the form

$$\Phi(F_1(x, y, z), F_2(x, y, z), F_3(x, y, z)) = 0 \tag{52}$$

connecting the functions where $\Phi$ is a function of three variables
of the form $\Phi = \Phi(\lambda, \mu, \nu)$ which is not identically equal to zero.
If otherwise, we say that the functions are independent. We can solve
relation (52) for one of the variables $F_1$, $F_2$ or $F_3$, and therefore
we can also say that functions (51) are dependent if one of them is
expressible as a function of the others.

For example, the functions 

$$\left(\frac{x - y}{x + z}\right)^2, \qquad \ln(x + z) \qquad \text{and} \qquad x - y \tag{53}$$

are dependent because if we denote them as $F_1$, $F_2$, $F_3$ we have
$F_1 e^{2F_2} - F_3^2 = 0$.

To establish a test for the existence of a functional relation bet- 
ween functions in the general case let us consider an auxiliary map- 
ping of the form 

$$\lambda = F_1(x, y, z), \qquad \mu = F_2(x, y, z), \qquad \nu = F_3(x, y, z) \tag{54}$$

For functions (51) to be dependent, it is necessary that relation
(52), that is the relation $\Phi(\lambda, \mu, \nu) = 0$, should be true for all x,
y, z. The relation defines a surface in the $\lambda, \mu, \nu$-space. Therefore,
we see that for functions (51) to be dependent, it is necessary that
the x, y, z-space be carried into a surface in the $\lambda, \mu, \nu$-space under
mapping (54). This means that the mapping should be degenerate.
Applying the result of Sec. 14 we arrive at the condition

$$\frac{D(F_1, F_2, F_3)}{D(x, y, z)} \equiv 0 \tag{55}$$

which is necessary and sufficient for functions (51) to be dependent.
[Let the reader check up the fulfilment of condition (55) for
functions (53).]
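
Condition (55) can also be tested numerically. The sketch below assumes that functions (53) are read as $F_1 = \left(\frac{x-y}{x+z}\right)^2$, $F_2 = \ln(x+z)$, $F_3 = x - y$ (the reading that agrees with the relation $F_1 e^{2F_2} - F_3^2 = 0$); the second triple xy, yz, zx is a contrasting example of our own, and the helper names are hypothetical.

```python
import math

def dependent_triple(x, y, z):
    """Functions (53) as read here: ((x-y)/(x+z))^2, ln(x+z), x-y."""
    return (((x - y) / (x + z)) ** 2, math.log(x + z), x - y)

def independent_triple(x, y, z):
    """A contrasting example of our own: xy, yz, zx."""
    return (x * y, y * z, z * x)

def jacobian_det(f, x, y, z, h=1e-5):
    """Determinant D(F1, F2, F3)/D(x, y, z) from central differences."""
    point = (x, y, z)
    columns = []
    for j in range(3):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        columns.append([(a - b) / (2 * h) for a, b in zip(f(*plus), f(*minus))])
    m = [[columns[j][i] for j in range(3)] for i in range(3)]  # m[i][j] = dF_i/dx_j
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

for point in [(1.0, 0.5, 2.0), (2.0, -1.0, 0.5)]:
    print(point,
          jacobian_det(dependent_triple, *point),
          jacobian_det(independent_triple, *point))
```

For the first triple the determinant vanishes up to rounding at every tested point, whereas for the second one it equals 2xyz and is different from zero.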

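If the rank of the matrix

$$\begin{pmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2mm] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z} \\[2mm] \dfrac{\partial F_3}{\partial x} & \dfrac{\partial F_3}{\partial y} & \dfrac{\partial F_3}{\partial z} \end{pmatrix}$$

equals unity the x, y, z-space, as it was shown in Sec. 14, is carried
into a curve in the $\lambda, \mu, \nu$-space. Taking advantage of equations
(X.2) we then conclude that in this case the functions $F_1$, $F_2$, $F_3$
are connected with each other by two independent relations of
form (52).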

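A similar result is also obtained in the case when the number of
independent variables differs from the number of functions. For
instance, two functions of three variables of the form $F_1(x, y, z)$
and $F_2(x, y, z)$ will be dependent if the mapping

$$\lambda = F_1(x, y, z), \qquad \mu = F_2(x, y, z)$$

transforms the x, y, z-space into a curve with an equation of
the form $\Phi(\lambda, \mu) = 0$ in the $\lambda, \mu$-plane. According to Sec. 14,
the condition guaranteeing the above property is that the rank of
the matrix

$$\begin{pmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2mm] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z} \end{pmatrix}$$

should be less than two, that is there should be equalities of the form

$$\begin{vmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial y} \\[2mm] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial y} \end{vmatrix} \equiv 0, \qquad \begin{vmatrix} \dfrac{\partial F_1}{\partial x} & \dfrac{\partial F_1}{\partial z} \\[2mm] \dfrac{\partial F_2}{\partial x} & \dfrac{\partial F_2}{\partial z} \end{vmatrix} \equiv 0 \qquad \text{and} \qquad \begin{vmatrix} \dfrac{\partial F_1}{\partial y} & \dfrac{\partial F_1}{\partial z} \\[2mm] \dfrac{\partial F_2}{\partial y} & \dfrac{\partial F_2}{\partial z} \end{vmatrix} \equiv 0$$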
The condition for an arbitrary number of functions of any number 
of arguments to be functionally related is put down in a similar 
form. It should be noted that in case the number of functions exceeds 
the number of independent variables the functions are always de- 
pendent. 



CHAPTER XII 


Applications of Partial 
Derivatives 


§ 1. Scalar Field 

1. Directional Derivative. Gradient. Let a Cartesian coordinate
system x, y, z in space be given. Then, according to Sec. IX.9, a
stationary scalar field can be regarded as a function u = u(x, y, z).
(When investigating a non-stationary field we can apply the same
point of view at any fixed moment of time.) Besides, let a point M
in space also be given. Suppose that a curve (L) starts from the
point M in the direction l (see Fig. 226). Then the rate of change
of the field in this direction (related to unit length) is called the
derivative of u along the direction l:

$$\frac{\partial u}{\partial l} = \lim_{\Delta s \to 0} \frac{\Delta u}{\Delta s} \tag{1}$$

To compute the directional derivative let us suppose that the curve
(L) is represented in parametric form by the equation r = r(s) where
the parameter s is the arc length reckoned along (L) (see Sec. VII.23).
Then the values of u taken along (L) form a composite function of the
arc length: u(s) = u(x(s), y(s), z(s)). The sought-for derivative is then
nothing but the derivative $\frac{du}{ds}$. Therefore, by the rule of
differentiating a composite function (IX.11), we have

$$\frac{\partial u}{\partial l} = \frac{\partial u}{\partial x}\frac{dx}{ds} + \frac{\partial u}{\partial y}\frac{dy}{ds} + \frac{\partial u}{\partial z}\frac{dz}{ds}$$


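The right-hand side can be represented as a scalar product of two
vectors [see formula (VII.12)]:

$$\frac{\partial u}{\partial l} = \left(\frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k}\right) \cdot \left(\frac{dx}{ds}\mathbf{i} + \frac{dy}{ds}\mathbf{j} + \frac{dz}{ds}\mathbf{k}\right)$$

The first of the vectors is called the gradient of the field (function)
u. It is designated as

$$\operatorname{grad} u = \frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k} \tag{2}$$

The meaning of this vector will be discussed a little later. The
second vector

$$\frac{dx}{ds}\mathbf{i} + \frac{dy}{ds}\mathbf{j} + \frac{dz}{ds}\mathbf{k} = \frac{d(x\mathbf{i} + y\mathbf{j} + z\mathbf{k})}{ds} = \frac{d\mathbf{r}}{ds} = \boldsymbol{\tau}$$

is the unit vector in the direction l (see Sec. VII.23). Thus,

$$\frac{\partial u}{\partial l} = \operatorname{grad} u \cdot \boldsymbol{\tau} \tag{3}$$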


The first factor entering into the right-hand side depends only
on the choice of the point M. The second factor depends only on
the choice of the direction l. In particular, we see that $\frac{\partial u}{\partial l}$ is
independent of the choice of a concrete curve (L) among all the possible
curves passing through M in the given direction l. (By the way,
it should be noted that the second derivative $\frac{\partial^2 u}{\partial l^2}$ will no longer be
independent of the choice.)

According to formula (VII.5), we deduce from (3) the expression

$$\frac{\partial u}{\partial l} = \operatorname{proj}_l (\operatorname{grad} u) = \operatorname{grad}_l u \tag{4}$$

($\operatorname{grad}_l u$ designates the projection of the gradient on the axis
passing in the direction l).
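
Formula (3) and the remark on the independence of the curve can be illustrated numerically. In the sketch below the field u = x²y + sin z, the point and the direction are arbitrary choices of ours; the curve is taken in the hypothetical form r(s) = M + sτ + s²b, where the vector b bends the curve without changing its direction at M.

```python
import math

def u(x, y, z):
    """Illustrative scalar field (our choice)."""
    return x * x * y + math.sin(z)

def grad_u(x, y, z):
    """Gradient of u according to definition (2)."""
    return (2 * x * y, x * x, math.cos(z))

def unit(direction):
    norm = math.sqrt(sum(c * c for c in direction))
    return [c / norm for c in direction]

def directional_derivative(point, direction):
    """Formula (3): scalar product of grad u and the unit vector tau."""
    return sum(g * t for g, t in zip(grad_u(*point), unit(direction)))

def difference_quotient(point, direction, bend=(0.0, 0.0, 0.0), h=1e-5):
    """Symmetric difference quotient of u along r(s) = M + s*tau + s^2*bend."""
    tau = unit(direction)

    def r(s):
        return [p + s * t + s * s * b for p, t, b in zip(point, tau, bend)]

    return (u(*r(h)) - u(*r(-h))) / (2 * h)

M = (1.0, 2.0, 0.5)
l = (1.0, 2.0, 2.0)
print(directional_derivative(M, l))
print(difference_quotient(M, l))                         # straight line
print(difference_quotient(M, l, bend=(3.0, -1.0, 2.0)))  # bent curve, same direction
```

The three printed values agree to within the accuracy of the difference quotient, which is what the independence of the choice of the curve (L) predicts.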

Note that the derivatives $u'_x$, $u'_y$ and $u'_z$ are also directional
derivatives: for instance, $u'_x$ is the derivative in the direction of the
x-axis.

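Let us put down one more useful formula containing the gradient
which is based on definition (IX.7) of the total differential:

$$du = \frac{\partial u}{\partial x}dx + \frac{\partial u}{\partial y}dy + \frac{\partial u}{\partial z}dz = \left(\frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j} + \frac{\partial u}{\partial z}\mathbf{k}\right) \cdot (dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}) = \operatorname{grad} u \cdot d(x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) = \operatorname{grad} u \cdot d\mathbf{r}$$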

Let a field u and a point M be given. Let us set the following
problem: in what direction l is the derivative $\frac{\partial u}{\partial l}$ maximal? We see that
on the basis of formula (4) the problem reduces to the following
question: in what direction is the projection of the vector grad u
maximal? Evidently, the maximal projection of any vector is
obtained when we take its own direction, the maximal projection
being equal to the modulus of the vector.

Thus, the vector grad u at a point M indicates the direction of
the maximal rate of increase of the field (function) u, this maximal
rate (related to unit length) being equal to |grad u|. The faster
the change of the field, the greater the modulus. See Fig. 227 where
the outer circle bounds a part of a heat conducting medium which
is being cooled from outside and which is being heated from the
internal region (shaded). The arrows represent the vector field of
gradients of the scalar temperature field in question. We see that
the gradient of the temperature is directed “toward the stove”.
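
The extremal property of the gradient can be observed by a direct scan over directions. This is a rough sketch, not a proof: the field u = x²y + sin z, the point and the angular grid are our own illustrative choices.

```python
import math

def grad_u(x, y, z):
    """Gradient of the illustrative field u = x^2 y + sin z."""
    return (2 * x * y, x * x, math.cos(z))

def unit_directions(steps=60):
    """Unit vectors on a latitude-longitude grid of the sphere."""
    for i in range(steps + 1):
        theta = math.pi * i / steps
        for j in range(2 * steps):
            phi = math.pi * j / steps
            yield (math.sin(theta) * math.cos(phi),
                   math.sin(theta) * math.sin(phi),
                   math.cos(theta))

point = (1.0, 2.0, 0.5)
g = grad_u(*point)
modulus = math.sqrt(sum(c * c for c in g))

# Largest directional derivative found on the grid, by formula (3).
largest = max(sum(gc * tc for gc, tc in zip(g, tau)) for tau in unit_directions())
# Directional derivative taken exactly along the gradient.
along_gradient = sum(gc * gc / modulus for gc in g)

print(modulus, largest, along_gradient)
```

No sampled direction gives a derivative exceeding |grad u|, and the direction of the gradient itself attains this value.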

The physical meaning of the gradient implies that the relationship
between a scalar field and its gradient is invariant, that is it
remains the same when an original Cartesian coordinate system is
replaced by another because the rate and the direction of maximal
increase of a field are independent of the choice of a coordinate
system. [By the way, the original definition (2) of the gradient which
is connected with a particular choice of a Cartesian coordinate system
does not directly imply the invariance.] Moreover, if we are given
a field u we can find the direction and the rate of maximal increase
of the field u at every point in space and hence we can find the vector
grad u without using coordinates and without representing the field
as a function u(x, y, z). Thus, vectors grad u form a completely
specified vector field of gradients corresponding to a given scalar field.

Analogous conditions of invariance are set for all basic notions 
of the theory of vector field which we will not study in this chapter. 
The matter is that when we change an original Cartesian coordinate 
system the projections of vectors change although the vectors them- 
selves remain invariant. Therefore, if a concept related to the theory 
of vector field is formulated in terms of coordinates or projections 
of a field we must additionally verify whether the concept satisfies 
the condition of invariance with respect to the changes of coordi- 
nates and projections when the coordinate axes are rotated. 

Let us illustrate the application of the concept of gradient to
the problem of computing the rate of change of a scalar field along
a trajectory. Suppose we have a field u which may be non-stationary
in the general case, that is u = u(x, y, z, t). Besides, let
a certain law of motion of a particle M be given in the form r = r(t).
If we consider the value of u at M in the process of motion then the
value becomes a composite function of time: u = u(x(t), y(t),
z(t), t). To compute the sought-for rate of change of the value we
can apply transformations similar to the ones given above. This
leads to the so-called total derivative

$$\frac{du}{dt} = \operatorname{grad} u \cdot \mathbf{v} + \frac{\partial u}{\partial t} = \frac{\partial u}{\partial \tau} v + \frac{\partial u}{\partial t}$$

where $\mathbf{v} = \frac{d\mathbf{r}}{dt}$ is the velocity vector of the particle in the process
of its motion and $\frac{\partial u}{\partial \tau}$ is the derivative of u in the direction of the
tangent to the trajectory.

In case the field is stationary, that is if $\frac{\partial u}{\partial t} \equiv 0$, we have only

the first summand on the right-hand side. Hence, this summand repre- 
sents the rate of change of the field which is due only to the transi- 
tion of the point M from one value of u to another along the trajec- 
tory. For instance, if u is temperature such a summand describes 
the changes of the temperature which are due to the transition of 
the point M from one region in space to another region with diffe- 
rent temperature and the like. This is the so-called convective ve- 
locity. The second summand represents the rate of change at a 
motionless point (coinciding with the current position of the moving 
point M at a certain moment of time) which is due to the non-sta- 
tionarity of the field. This is the local velocity. In the general case 
we have both factors which add together and yield the resultant 
rate of change of the field along the trajectory which is the sum of 
the convective velocity and the local velocity. 
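
The decomposition of the total derivative into the convective and the local parts can be verified on an example. Both the non-stationary field u = xy + zt² and the helical law of motion below are our own illustrative choices.

```python
import math

def u(x, y, z, t):
    """Illustrative non-stationary field (our choice)."""
    return x * y + z * t * t

def position(t):
    """Illustrative law of motion r = r(t): a helix."""
    return (math.cos(t), math.sin(t), t)

def velocity(t):
    """v = dr/dt for the helix above."""
    return (-math.sin(t), math.cos(t), 1.0)

def total_derivative(t):
    """Return the convective part grad u . v and the local part du/dt at a fixed point."""
    x, y, z = position(t)
    grad = (y, x, t * t)                 # partial derivatives of u in x, y, z
    convective = sum(g * v for g, v in zip(grad, velocity(t)))
    local = 2.0 * z * t                  # partial derivative of u in t
    return convective, local

def numerical_rate(t, h=1e-6):
    """Rate of change of u seen by the moving particle (central difference)."""
    return (u(*position(t + h), t + h) - u(*position(t - h), t - h)) / (2 * h)

t = 0.8
convective, local = total_derivative(t)
print(convective, local, convective + local, numerical_rate(t))
```

The sum of the two parts coincides with the rate of change computed directly along the trajectory.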

2. Level Surfaces. Level surfaces of a field u (x, y, z) (see Sec. IX.7) 
are the surfaces on which the field assumes constant values, 
that is the surfaces represented by equations of the form
u(x, y, z) = const. Depending on the physical meaning of the field in
question these surfaces may be called isothermic surfaces, isobaric
surfaces and the like. There is a simple relationship between these
surfaces and the gradient of the field: at each point M the gradient 
is normal (i.e. perpendicular to the tangent plane) to the level 
surface passing through the point M. 

Actually, as it is seen from Fig. 228, the surfaces u = C and
u = C + ΔC can be regarded as being almost plane near the point
M if ΔC is sufficiently small, and besides
$\frac{\partial u}{\partial l} \approx \frac{\Delta u}{\Delta s} = \frac{\Delta C}{\Delta s}$. But it
is clear that if l is directed along the normal to the surface the
quantity Δs will assume its least value, and $\frac{\partial u}{\partial l}$ will therefore assume
its maximal value. This implies our assertion.

In particular, we see that the assertion enables us to solve the
following problem: to find the equation of the tangent plane passing
through a point $M_0(x_0, y_0, z_0)$ of a surface (L) having an equation
of the form F(x, y, z) = 0. To solve the problem let us introduce
a scalar field in space by means of the equation u = F(x, y, z).
Then (L) becomes one of the level surfaces of the field because we
have u = F(x, y, z) = 0 on the surface. Then the vector

$$(\operatorname{grad} u)_{M_0} = \left(\frac{\partial F}{\partial x}\right)_0 \mathbf{i} + \left(\frac{\partial F}{\partial y}\right)_0 \mathbf{j} + \left(\frac{\partial F}{\partial z}\right)_0 \mathbf{k}$$

(the subscript “zero” indicates that the corresponding derivatives
are taken at the point $M_0$) is perpendicular to the sought-for tangent
plane. Hence, according to Sec. X.7 (see problem 2), we obtain
the equation of the plane:

$$\left(\frac{\partial F}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial F}{\partial y}\right)_0 (y - y_0) + \left(\frac{\partial F}{\partial z}\right)_0 (z - z_0) = 0 \tag{5}$$

The last equation can be put down as dF = 0. Let the reader think
how we could deduce this equation in a direct way.

A surface for which the tangent plane is to be constructed can be
represented by an equation of the form z = f(x, y). Here we can
rewrite the equation as z - f(x, y) = 0 and denote its left-hand
side by F(x, y, z). Then formula (5) is directly applicable, and
thus we have
$-\left(\frac{\partial f}{\partial x}\right)_0 (x - x_0) - \left(\frac{\partial f}{\partial y}\right)_0 (y - y_0) + (z - z_0) = 0$,
i.e.

$$z - z_0 = \left(\frac{\partial f}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial f}{\partial y}\right)_0 (y - y_0) \tag{6}$$

[Fig. 229: $\tan\alpha = \left(\frac{\partial f}{\partial x}\right)_0$, $\tan\beta = \left(\frac{\partial f}{\partial y}\right)_0$]

The right-hand side being equal to the total differential df, we
thus obtain the geometric meaning of the total differential of a
function of two independent variables. Namely, the differential
is equal to the increment of the third coordinate of the point in
the tangent plane (see Fig. 229).
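
Formulas (5) and (6) can be tried out on a concrete surface. In the sketch below the ellipsoid x² + 2y² + 3z² = 6 and the point M₀(1, 1, 1) are our own illustrative choices; the slopes in formula (6) are obtained from the gradient as f'_x = -F'_x/F'_z and f'_y = -F'_y/F'_z.

```python
import math

def F(x, y, z):
    """Illustrative surface F = 0: an ellipsoid (our choice)."""
    return x * x + 2 * y * y + 3 * z * z - 6.0

def grad_F(x, y, z):
    return (2 * x, 4 * y, 6 * z)

M0 = (1.0, 1.0, 1.0)     # F(M0) = 0, so M0 lies on the surface
normal = grad_F(*M0)     # perpendicular to the tangent plane at M0

def plane_residual(x, y, z):
    """Left-hand side of the tangent-plane equation (5) at M0."""
    return sum(n * (c - c0) for n, c, c0 in zip(normal, (x, y, z), M0))

def surface_z(x, y):
    """Upper sheet z = f(x, y) of the ellipsoid."""
    return math.sqrt((6.0 - x * x - 2 * y * y) / 3.0)

def tangent_z(x, y):
    """Formula (6): the third coordinate of the point in the tangent plane."""
    fx = -normal[0] / normal[2]
    fy = -normal[1] / normal[2]
    return M0[2] + fx * (x - M0[0]) + fy * (y - M0[1])

def gap(d):
    """Distance along z between the surface and the tangent plane near M0."""
    x, y = M0[0] + d, M0[1] - 2 * d
    return abs(surface_z(x, y) - tangent_z(x, y))

print(gap(0.02), gap(0.01), gap(0.02) / gap(0.01))
```

Halving the displacement reduces the gap roughly four times, in agreement with the fact that the tangent plane differs from the surface by terms of the second order of smallness.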

Take an example. Let us compute the gradient of a centrally
symmetric field u = f(r) where $r = |\mathbf{r}| = \sqrt{x^2 + y^2 + z^2}$. In
this case the level surfaces are concentric spheres with centre at the
origin of coordinates (why is it so?). If we take two spheres for
which the difference of their radii is equal to dr then the difference
of the corresponding values of the function f which are taken on
these surfaces will be equal to df. Therefore, the rate of change of
the function in a direction which is transversal to the level surfaces
(that is along a radius) is equal to $\frac{df}{dr}$. Hence,

$$\operatorname{grad} u(r) = \frac{df}{dr}\,\mathbf{r}^0 = \frac{df}{dr}\,\frac{\mathbf{r}}{r} \tag{7}$$

where $\mathbf{r}^0 = \frac{\mathbf{r}}{r}$ is the unit vector in the direction of the vector r.
[Let the reader obtain result (7) on the basis of definition (2).]
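
The agreement between result (7) and definition (2) can be checked numerically before it is proved. The sketch assumes the particular field f(r) = 1/r, chosen by us for illustration; the partial derivatives in definition (2) are replaced by central differences.

```python
import math

def f(r):
    """Illustrative centrally symmetric field: f(r) = 1/r (our choice)."""
    return 1.0 / r

def df_dr(r):
    return -1.0 / (r * r)

def grad_by_formula_7(x, y, z):
    """grad u = (df/dr) * r/|r|, as in result (7)."""
    r = math.sqrt(x * x + y * y + z * z)
    return tuple(df_dr(r) * c / r for c in (x, y, z))

def grad_by_definition_2(x, y, z, h=1e-6):
    """Partial derivatives of u(x, y, z) = f(r) by central differences."""
    def u(px, py, pz):
        return f(math.sqrt(px * px + py * py + pz * pz))
    return ((u(x + h, y, z) - u(x - h, y, z)) / (2 * h),
            (u(x, y + h, z) - u(x, y - h, z)) / (2 * h),
            (u(x, y, z + h) - u(x, y, z - h)) / (2 * h))

point = (1.0, 2.0, 2.0)   # here r = 3
print(grad_by_formula_7(*point))
print(grad_by_definition_2(*point))
```

Both computations give the same vector, which is directed along the radius toward the origin because 1/r decreases with r.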

3. Implicit Functions of Two Independent Variables. Implicit 
functions of two arguments were discussed in Sec. IX. 13. Now we 
can approach them from a new point of view. Let us consider the 
equation 

F ( x , y, z) = 0 (8) 

in the vicinity of the point $M_0(x_0, y_0, z_0)$ at which the equation
is satisfied. The equation defines a surface (L) in space passing
through the point $M_0$. If $\left(\frac{\partial F}{\partial z}\right)_0 \neq 0$ [see condition (IX.16)] then,
by formula (2), the vector $(\operatorname{grad} F)_{M_0}$ has a nonzero component in
the direction of the z-axis. This implies that the tangent plane to
(L) passing through $M_0$ [which is perpendicular to $(\operatorname{grad} F)_{M_0}$]
is not parallel to the z-axis. Therefore, near the point $M_0$ at which
the surface (L) touches the plane the angle between the z-axis and
the surface (L) is different from the right angle. Hence, in the
vicinity of $M_0$ equation (8) defines a relationship of the form z = z(x, y).
This functional relationship is local (i.e. it is defined only near M 0 
or, as we say, “in the small”) because if we take a point which is 
not sufficiently close to M 0 we can encounter the case when there are 
several values of z corresponding to given values of x and y or when 
there are no such values at all (see Fig. 230). It should be noted that 
the condition for the existence of a system of implicit functions 
established in Sec. IX. 13 is also of a local character because it 
guarantees their existence only in the vicinity of the point in question. 

If $\left(\frac{\partial F}{\partial z}\right)_0 = 0$ then the tangent plane to (L) is parallel to the
z-axis at the point under consideration (as it is at the point $N_0$ in
Fig. 230). In such a case it can happen that even for some points
(x, y) lying very close to $N_0$ equation (8) does not define a one-valued
function z = z(x, y). For instance, we see that some values of x
and y taken near the point $N_0$ yield two possible values of z but
at the same time there are no such values at all for other values of
x and y because the surface is convex in the direction of the radius
at the point $N_0$. But if, for example, we have $\frac{\partial F}{\partial y} \neq 0$ at $N_0$ then
near $N_0$ equation (8) defines a function y = y(x, z). Equation (8)
can happen not to define any coordinate as a function of the other
coordinates near a point belonging to a surface (L) represented by
an equation F(x, y, z) = 0 only if we simultaneously have

$$\frac{\partial F}{\partial x} = 0, \qquad \frac{\partial F}{\partial y} = 0, \qquad \frac{\partial F}{\partial z} = 0 \tag{9}$$

at this point.

Points at which conditions (9) hold are called singular points of
the surface (L). A “typical” point belonging to a surface defined by
equation (8) is not singular because such points are found by solving
the system of four equations (8) and (9) in three unknowns
x, y and z which is inconsistent (overdetermined) in the general
case. Therefore most of the surfaces have no singular points. Among
the well-known surfaces only conic surfaces possess singular points
which are their vertices.
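
The search for singular points amounts to checking the equation of the surface together with the vanishing of all three partial derivatives. A minimal sketch follows; the cone x² + y² - z² = 0 and the unit sphere are our own examples, and the helper `is_singular` is hypothetical.

```python
def cone(x, y, z):
    return x * x + y * y - z * z

def cone_grad(x, y, z):
    return (2 * x, 2 * y, -2 * z)

def sphere(x, y, z):
    return x * x + y * y + z * z - 1.0

def sphere_grad(x, y, z):
    return (2 * x, 2 * y, 2 * z)

def is_singular(F, grad, point, tol=1e-12):
    """True if the point lies on F = 0 and all first partial derivatives vanish there."""
    on_surface = abs(F(*point)) <= tol
    gradient_vanishes = all(abs(c) <= tol for c in grad(*point))
    return on_surface and gradient_vanishes

print(is_singular(cone, cone_grad, (0.0, 0.0, 0.0)))      # vertex of the cone
print(is_singular(cone, cone_grad, (3.0, 4.0, 5.0)))      # ordinary point of the cone
print(is_singular(sphere, sphere_grad, (0.0, 0.0, 0.0)))  # gradient vanishes, but off the sphere
```

For the sphere the three derivatives vanish only at the centre, which does not satisfy the equation of the surface, so the four equations are indeed inconsistent there.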

4. Plane Fields. All the notions established for space fields are
transferred with corresponding simplifications to plane fields (see
the end of Sec. IX.9). For instance, the gradient
$\operatorname{grad} u = \frac{\partial u}{\partial x}\mathbf{i} + \frac{\partial u}{\partial y}\mathbf{j}$
of the field u(x, y) is a vector lying in the x, y-plane. The
gradient of a plane field is normal to the level line, that is to the
curve represented by an equation of the form u(x, y) = const
at each point (x, y) (see Fig. 231). In this case the
meaning of the gradient implies that its modulus is approximately
inversely proportional to the distance between
the level lines, that is the level lines are closer to each other in those
regions where the gradient is longer.

Naturally, the equation of the tangent line to a curve represen- 
ted by an equation of the form 

$$f(x, y) = 0 \tag{10}$$

is obtained from (5) by dropping the third summand.

Equation (10) locally defines a function y = y(x) if $f'_y \neq 0$.
Singular points of curve (10) are the points for which

$$f'_x = 0 \qquad \text{and} \qquad f'_y = 0 \tag{11}$$

Let us introduce a surface (L) having the equation z = f(x, y). Then
curve (10) can be interpreted as the line of intersection of (L) by
the plane z = 0. If conditions (10) and (11) are fulfilled at a point
formula (6) implies that the plane is tangent to the surface (L)
at the point. In Sec. 9 we shall investigate the form of the line of
intersection of a surface with its tangent plane near the point of
tangency. We shall see that usually singular points of a plane curve
are isolated points, nodal points (double points) or, more seldom,
cusps (see Fig. 232).

[Fig. 232: (a) Isolated singular point; (b) Nodal point; (c) Cusp]

5. Envelope of One-Parameter Family of Curves. Let us consider 
a family of curves dependent on a single parameter C (a one-parameter
family). The general form of an equation of such curves can
be put down as

F (x, y, C) = 0 (12) 

Making C assume a certain concrete value we isolate an individual 
curve from the family. It often happens that the disposition of the 
curves resembles Fig. 233. In such a case we say that the family 
possesses an envelope, that is a curve (which usually does not enter 
into the family) which touches some curve of the family at each 
of its points. To find the equation of an envelope we note that each 
of its points belongs to a curve of the family and equation (12) is 
therefore satisfied at each point. But, at the same time, in
moving along the envelope the value of the quantity C [determining
the curve of the family which the envelope touches at a point (x, y)]
varies: C = C(x). Differentiating equality (12) with respect to x
(after the ordinate of the envelope which is a function of x has been
substituted for y) we obtain

$$F'_x + F'_y\, y'_{\text{envelope}} + F'_C\, C'_x = 0 \tag{13}$$

where $y'_{\text{envelope}}$ is the slope of the envelope (i.e. of its tangent line)
at an arbitrary point M. But, the envelope touching a curve of the
family at the point M, the slope of the envelope equals the slope
of the curve, that is $y'_{\text{curve}} = y'_{\text{envelope}}$. The quantity $y'_{\text{curve}}$ is found
from equation (12) by differentiating with respect to x for a fixed C:

$$F'_x + F'_y\, y'_{\text{curve}} = 0 \tag{14}$$

Hence, (13) and (14) imply that $F'_C\, C'_x = 0$. But, as it has been
indicated, C(x) is a variable quantity and therefore, in general,
$C'_x(x) \neq 0$. Consequently, we have

$$F'_C(x, y, C) = 0 \tag{15}$$

Thus, for the points of the envelope, equations (12) and (15) hold 
simultaneously. Eliminating C from these two equations we arrive 
at the equation of the envelope. 

Example. Let us consider the family of the trajectories of
motion of a shell under assumptions enumerated in example 1 of
Sec. II.6, when the initial velocity $v_0$ is given, for different values
of the angle of inclination α. Here α serves as a parameter of the
family, and therefore, in order to find the envelope (see Fig. 234),
we differentiate the equation of the family [equation (II.11)] with
respect to α:

$$0 = \frac{x}{\cos^2 \alpha} - \frac{g x^2 \sin \alpha}{v_0^2 \cos^3 \alpha}$$

Expressing tan α from the last equation and substituting this value
into the equation of the family we obtain the equation of the envelope:

$$y = \frac{v_0^2}{2g} - \frac{g x^2}{2 v_0^2}$$

Hence, the envelope is a parabola, the so-called safety parabola (why
is it called so?).
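
The safety parabola admits a simple numerical check: for every abscissa x the envelope must give the greatest height attainable over all angles α. The sketch assumes that the family (II.11) has the usual form for motion without air resistance, y = x tan α - gx²/(2v₀² cos² α), with the envelope y = v₀²/(2g) - gx²/(2v₀²); the values of g and v₀ and the number of sampled angles are illustrative.

```python
import math

G = 9.81  # free-fall acceleration in m/s^2 (assumed value)

def trajectory(x, alpha, v0):
    """Height of the shell at abscissa x for the angle of inclination alpha."""
    return x * math.tan(alpha) - G * x * x / (2.0 * v0 * v0 * math.cos(alpha) ** 2)

def envelope(x, v0):
    """Safety parabola obtained by eliminating alpha."""
    return v0 * v0 / (2.0 * G) - G * x * x / (2.0 * v0 * v0)

def highest_reachable(x, v0, samples=20000):
    """Largest height at abscissa x over a fine grid of angles in (0, pi/2)."""
    step = (math.pi / 2.0) / samples
    return max(trajectory(x, (i + 0.5) * step, v0) for i in range(samples))

v0 = 10.0
for x in (1.0, 3.0, 6.0):
    print(x, highest_reachable(x, v0), envelope(x, v0))
```

At each x the largest sampled height practically coincides with the ordinate of the envelope and never exceeds it, so no trajectory of the family passes above the parabola.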

Let us take one more example. As we know, all the normals to an 
evolvent touch the evolute (see Sec. VII. 26) and thus the evolute 
is the envelope of the family of all the normals to the evolvent. This 
property implies an approximate method of constructing an evolute: 
we draw several normals to the evolvent and then trace their envelope.

We must take into account that if the curves belonging to a family
in question possess singular points (see Sec. 4) then, when eliminating
C from (12) and (15), we obtain, besides the envelope, the curve
which is the locus of singular points (see Fig. 235). Indeed, as it
was shown in Sec. 4, we have $F'_x = F'_y = 0$ for such points and
therefore in this case (13) implies (15) even if the slopes of the locus
of singular points and of the curves of the family do not coincide,
that is if $y'_{\text{locus}} \neq y'_{\text{curve}}$.

[Fig. 235: (a) Envelope; (b) Locus of nodal points]


§ 2. Extremum of a Function of Several Variables 

6. Taylor’s Formula for a Function of Several Variables. For
definiteness, let us consider a function f(x, y) of two variables.
(Similar results are valid for an arbitrary number of independent
variables.) It turns out that formula (IV.62) remains true for such
a function f without any changes.

To prove the assertion let us take an arbitrary direction and draw
a ray passing through a point (a, b) in the x, y-plane in this direction
(which is indicated by the arrow in Fig. 236). The values
of the function f which are taken on this ray depend only on the
single argument ρ, i.e. f(x, y) = f*(ρ). We can thus
apply formula (IV.62) to the function f*.
We have Δf* = Δf but at the same time,
in investigating the relationship between the differentials of f*
and f, we must take into account the following fact. According to
Fig. 236, we have

$$x = a + \rho \cos \varphi, \qquad y = b + \rho \sin \varphi \qquad (a, b, \varphi = \text{const}) \tag{16}$$

and therefore

$$f^*(\rho) = f(x, y) = f(a + \rho \cos \varphi,\; b + \rho \sin \varphi)$$

Hence, in computing $df, d^2f, \ldots$ we consider the variables x
and y to be independent whereas in computing $df^*, d^2f^*, \ldots$ we
consider them to be dependent on ρ. As we know (see Secs. IX.12
and IX.16), this does not matter for the first-order differentials,
that is we have $df^* = df$, but, generally, this is essential for
higher-order differentials. But in our case formula (16) implies that

$$d^2x = d^3x = \ldots = 0 \qquad \text{and} \qquad d^2y = d^3y = \ldots = 0$$

Therefore in this particular case formulas (IX.23) and (IX.24)
show that $d^2f^* = d^2f$ and, similarly, $d^3f^* = d^3f$ etc.

Thus, formula (IV.62) for the function f* automatically implies
the validity of the same formula for the function f(x, y).

In practical applications we usually truncate the formula thus 
retaining only one or two terms. Then we get (see Sec. IX. 16) the 
formulas 

$$f(a + h, b + k) = f(a, b) + f'_x(a, b)\,h + f'_y(a, b)\,k + \text{the terms of the order of smallness not less than the second (relative to } h \text{ and } k) \tag{17}$$

and

$$f(a + h, b + k) = f(a, b) + f'_x(a, b)\,h + f'_y(a, b)\,k + \frac{1}{2}\left[f''_{xx}(a, b)\,h^2 + 2f''_{xy}(a, b)\,hk + f''_{yy}(a, b)\,k^2\right] + \text{the terms of the order of smallness not less than the third} \tag{18}$$

As in the case of a function of one argument, formulas (17) and
(18) can be applied if |h| and |k| are sufficiently small because
otherwise the formulas can lead to incorrect results. In all cases
when we apply Taylor’s formula we suppose that the corresponding
derivatives exist and are finite.
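
The orders of smallness of the discarded terms in (17) and (18) can be observed numerically. The sketch uses the function f(x, y) = eˣ sin y and the point (0.2, 0.5), both chosen by us for illustration, and takes h = k = t.

```python
import math

def f(x, y):
    """Illustrative function (our choice): e^x sin y."""
    return math.exp(x) * math.sin(y)

def taylor(a, b, h, k, order):
    """Right-hand side of (17) for order 1 and of (18) for order 2, without the remainder."""
    ea, sb, cb = math.exp(a), math.sin(b), math.cos(b)
    value = ea * sb + ea * sb * h + ea * cb * k          # f + f'_x h + f'_y k
    if order == 2:
        # f''_xx = e^a sin b, f''_xy = e^a cos b, f''_yy = -e^a sin b
        value += 0.5 * (ea * sb * h * h + 2 * ea * cb * h * k - ea * sb * k * k)
    return value

def error(t, order, a=0.2, b=0.5):
    """Absolute error of the truncated formula for h = k = t."""
    return abs(f(a + t, b + t) - taylor(a, b, t, t, order))

for order in (1, 2):
    print(order, error(0.1, order), error(0.05, order),
          error(0.1, order) / error(0.05, order))
```

When t is halved the error of (17) decreases about four times and the error of (18) about eight times, as it should be for remainders of the second and of the third order.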

7. Extremum. As in Sec. 6, we shall take, for the sake of simplicity,
the case of a function z = f(x, y) of two arguments. The
definition of an extremum is similar to the one introduced for functions
of one independent variable (see Sec. IV.18). For instance, we say
that a function z = f(x, y) has a maximum at “a point” (that is
for certain values $x = x_0$ and $y = y_0$) if the value $f(x_0, y_0)$ is greater
than all the “neighbouring” values of the function f, i.e. than the
values f(x, y) taken for x and y which are, respectively, sufficiently
close to $x_0$ and $y_0$.

In this section we shall consider only extrema that are attained
in the interior of the domain of definition of a function f and, besides,
we shall suppose that the function f itself and its partial derivatives
have no discontinuities. Fig. 237 approximately represents the
disposition of the family of level lines of a function f of this type (see
Sec. IX.1) near its point of extremum.

We can easily establish the necessary condition for an extremum. 

Indeed, if we fix $y = y_0$ and make $x$ vary then the corresponding
point $(x, y)$ in Fig. 237 will move along the straight line $ll$, and
the function $f = f(x, y_0)$ (regarded as a function of $x$) will have
an extremum at $x = x_0$. The function $f = f(x, y_0)$ depending only
on $x$, we have $f'_x(x_0, y_0) = 0$, according to Sec. IV.18. This is
a partial derivative since it is taken for a fixed $y$. We similarly
consider the case when the point $(x, y)$ moves along the straight
line $mm$ and thus deduce the following necessary conditions for
an extremum:


$$f'_x(x_0, y_0) = 0, \quad f'_y(x_0, y_0) = 0 \tag{19}$$


(in the case of a function of a greater number of independent variables
we must similarly equate to zero all the partial derivatives of the
first order). A point (lying in the $x, y$-plane) at which conditions (19)
hold is called a critical (stationary) point of the function $f$.
Consequently, if the conditions imposed in the preceding paragraph
hold all the points of extremum of the function $f$ are its critical
points.

Conversely, let a critical point $(x_0, y_0)$
of the function $f(x, y)$ be found. Is
it correct to assert that this point 
must be a point of extremum? If there 
is only one critical point in a certain 
region and if we are sure that some 
physical or other conditions guarantee 
the existence of the extremum then our answer is affirmative. In 
other cases we must apply some sufficient conditions which we are 
going to study now. 

As we know from Sec. IV.18, the necessary condition for an extremum
of a function $f(x)$ of one independent variable which is
expressed by the equality $f'(x_0) = 0$ is at the same time “almost
sufficient” because if in addition we have $f''(x_0) \neq 0$ the extremum
at the point $x = x_0$ is sure to exist. But it turns out that for the case
of a function of several variables this is not so. If conditions (19)
hold for $f(x, y)$ and the partial derivatives of the second order of
the function of two variables are different from zero at the point
$(x_0, y_0)$ the extremum nevertheless may not exist. Thus, in the case
of functions of many variables we can have a situation of a new type.

For example, see the “graph” of the function $z = f(x, y) = x^2 + y^2$
depicted in Fig. 220. Conditions (19) yield a single
stationary point in this case which is the point $(0, 0)$. Evidently,
there is a minimum at this point because $f(0, 0) = 0$ and $z > 0$
at the other points. At the same time, for the function $z = -x^2 + y^2$
we have an essentially different case. Its “graph” is shown in Fig. 221.
Again, the only critical point is the origin of coordinates. We have
$z = y^2$ for $x = 0$, that is the function increases when we move from
the origin along the $y$-axis in both directions and, as a function of
one variable $y$, it has a minimum at the origin. But if $y = 0$ then
$z = -x^2$ and hence along the $x$-axis the function decreases in both
directions and has a maximum at the origin. Taking other straight
lines passing through the origin we see that the function also has
a maximum at the origin for one group of these straight lines and
has a minimum for the other group (by the way the function
assumes the constant value $z = 0$ on two straight lines which are
$y = \pm x$). In such a case we say that the function has a minimax at
the point in question. Hence, the function $z = -x^2 + y^2$ has neither
a maximum nor a minimum at the origin although necessary
conditions (19) are fulfilled in this case and the partial derivatives
$\frac{\partial^2 z}{\partial x^2} = -2$ and $\frac{\partial^2 z}{\partial y^2} = 2$ are different from zero.

After discussing these examples let us proceed to investigate 
the functions of general form. Suppose conditions (19) are fulfilled 
at a point $(x_0, y_0)$ for a function $f(x, y)$. Let us see whether or not
the function $f$ has an extremum at the point. In order to do this we
take Taylor’s formula (18) and put $a = x_0$, $b = y_0$ in it. This results
in

$$
\begin{aligned}
\Delta f &= f(x_0 + h, y_0 + k) - f(x_0, y_0) \\
&= \tfrac{1}{2}\left[f''_{xx}(x_0, y_0)\,h^2 + 2f''_{xy}(x_0, y_0)\,hk + f''_{yy}(x_0, y_0)\,k^2\right] \\
&\quad + \text{the terms of the order of smallness not less than the third}
\end{aligned}
$$

The terms of the first order of smallness relative to h and k do not 
enter in the result since the stationarity conditions (19) imply that 
these terms are equal to zero. The terms of the third order being con- 
siderably smaller than the terms of the second order for sufficiently 
small | h | and | k |, the sign of the right-hand side is determined 
by the group of terms of the second order. Hence the sign coincides 
with the sign of the quadratic form 

$$P(h, k) = f''_{xx}(x_0, y_0)\,h^2 + 2f''_{xy}(x_0, y_0)\,hk + f''_{yy}(x_0, y_0)\,k^2 \tag{20}$$

(we do not put down the factor $\tfrac{1}{2}$ here since it does not affect the
sign we are interested in). Consequently, if the sum (20) is positive
for all $h$ and $k$ (of course, except for $h = k = 0$ when it vanishes)
then we have $\Delta f > 0$ for sufficiently small $|h|$, $|k|$ which implies
that $f(x_0 + h, y_0 + k) > f(x_0, y_0)$. Hence, we have a minimum
at the point $(x_0, y_0)$ in this case. If the sum is negative then we
likewise conclude that there will be a maximum at the point $(x_0, y_0)$.
Finally, if the sum can assume the values of both signs there will
be a minimax at the point $(x_0, y_0)$ and thus there will be no
extremum. We cannot judge by the sum of the terms of the second order
whether there is an extremum only when the sum can vanish for
certain values of $h$ and $k$ which are not equal to zero but
does not change its sign (in particular, this is the case when the
sum is identically equal to zero and does not enter into the
expression of $\Delta f$). In such a case we must take into account the subsequent
terms of Taylor’s formula and take the sum of the terms of the third
order. Then a similar (but, of course, more complicated) investigation
of the sum for the values of $h$ and $k$ which turn the sum of the terms
of the second order into zero can indicate whether we have an
extremum and so on. We are not going to carry out this investigation
here.

These conclusions can be similarly drawn for functions of any 
number of independent variables. But in the case of a function of 
two variables we can easily proceed to express the sufficient condi- 
tions for the extremum directly in terms of the values of the second 
derivatives at the point $(x_0, y_0)$. To do this let us take $h^2$ outside
the brackets on the right-hand side of (20) and denote $\frac{k}{h} = t$. Then
we obtain

$$P(h, k) = \left[(f''_{xx})_0 + 2(f''_{xy})_0\,t + (f''_{yy})_0\,t^2\right] h^2 \tag{21}$$

[the subscript “zero” indicates that the values of the derivatives are
taken at the stationary point $(x_0, y_0)$]. As is well known from
elementary algebra, the polynomial in $t$ inside the square brackets has
two distinct real roots if its discriminant is positive, i.e. if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 > 0 \tag{22}$$

In this case the polynomial changes its sign when passing through 
the roots, and we therefore have a minimax here. But if 

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 < 0$$

then the polynomial has imaginary roots and consequently it does
not change its sign (why is it so?). Therefore we have an extremum
in this case. To find out what is the sign of the right-hand side of
(21) we put $t = 0$. Then we see that if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 < 0, \quad (f''_{xx})_0 > 0 \tag{23}$$

then the right-hand side of (21) is positive for all $t$ and thus, by
the results of the preceding paragraph, the function $f$ has a
minimum at the point $(x_0, y_0)$. Similarly, if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 < 0, \quad (f''_{xx})_0 < 0 \tag{24}$$

then the function $f$ has a maximum. Finally, if

$$(f''_{xy})_0^2 - (f''_{xx})_0 (f''_{yy})_0 = 0 \tag{25}$$

then the polynomial entering into (21) has a double root and thus
it does not change its sign but it can vanish for some nonzero values
of $h$ and $k$. Thus, this is an ambiguous case.
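Conditions (22)-(25) amount to a short decision procedure. A minimal sketch is given below; the function name `classify_stationary_point` and its argument names are assumptions made for the illustration, and the three arguments are taken to be the values of the second derivatives at a point where conditions (19) already hold.

```python
def classify_stationary_point(fxx, fxy, fyy):
    """Classify a stationary point from the second derivatives taken at it.

    fxx, fxy, fyy stand for (f''_xx)_0, (f''_xy)_0 and (f''_yy)_0.
    """
    d = fxy * fxy - fxx * fyy      # left-hand side of (22)-(25)
    if d > 0:
        return "minimax"           # case (22): form (20) takes both signs
    if d < 0:
        # cases (23) and (24): fxx cannot vanish here because fxx * fyy > fxy**2 >= 0
        return "minimum" if fxx > 0 else "maximum"
    return "ambiguous"             # case (25): higher-order terms are needed

if __name__ == "__main__":
    print(classify_stationary_point(2, 0, 2))    # z = x^2 + y^2 at the origin
    print(classify_stationary_point(-2, 0, 2))   # z = -x^2 + y^2 at the origin
```

The two calls correspond to the examples $z = x^2 + y^2$ and $z = -x^2 + y^2$ considered earlier in this section.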

The condition guaranteeing that quadratic form (20) should be 
positive can also be deduced from the general theory of quadratic 
forms (see Sec. XI.11). According to the theory, after a certain
rotation of the coordinate axes, form (20) is transformed to
a “diagonal form”, that is to the form

$$P = \lambda_1 h'^2 + \lambda_2 k'^2 \tag{26}$$

where $\lambda_1$ and $\lambda_2$ are the roots of the characteristic equation

$$\begin{vmatrix} (f''_{xx})_0 - \lambda & (f''_{xy})_0 \\ (f''_{xy})_0 & (f''_{yy})_0 - \lambda \end{vmatrix} = 0 \tag{27}$$

and $h'$ and $k'$ are the increments of the new coordinates (which occur
after the rotation has been performed). Equation (27) implies that

$$\lambda_1 \lambda_2 = (f''_{xx})_0 (f''_{yy})_0 - (f''_{xy})_0^2, \quad \lambda_1 + \lambda_2 = (f''_{xx})_0 + (f''_{yy})_0$$

(check it up!). We can easily deduce from these equalities that in
cases (22)-(25) we respectively have $\lambda_1 \lambda_2 < 0$ or $\lambda_1 > 0$, $\lambda_2 > 0$
or $\lambda_1 < 0$, $\lambda_2 < 0$ or $\lambda_1 \lambda_2 = 0$ (we leave it to the reader to verify
these assertions). From this, on the basis of equality (26), we deduce 
the same conditions as those in the preceding paragraph. 
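The relations between the roots of the characteristic equation (27) and the second derivatives can be checked numerically. The sketch below solves the quadratic equation for a symmetric 2 x 2 matrix directly; the function name `eigenvalues_2x2` is an assumption introduced for the illustration.

```python
import math

def eigenvalues_2x2(fxx, fxy, fyy):
    """Roots of the characteristic equation (27), smaller root first.

    The equation is lam**2 - (fxx + fyy) * lam + (fxx * fyy - fxy**2) = 0.
    """
    mean = 0.5 * (fxx + fyy)
    # Square root of the discriminant; it is never negative, so both roots are real
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    return mean - radius, mean + radius

if __name__ == "__main__":
    lam1, lam2 = eigenvalues_2x2(2.0, 1.0, 3.0)
    print(lam1 * lam2, 2.0 * 3.0 - 1.0 ** 2)   # product of the roots vs fxx*fyy - fxy^2
    print(lam1 + lam2, 2.0 + 3.0)              # sum of the roots vs fxx + fyy
```

In this example both roots are positive, which corresponds to case (23), that is to a minimum.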

In order to investigate the behaviour of a function $f(x_1, x_2, \dots, x_n)$
depending on an arbitrary number $n$ of arguments at a
stationary point we must take the quadratic form

$$\sum_{i,j=1}^{n} (f''_{x_i x_j})_0\, h_i h_j \tag{28}$$

instead of (20) and the equation

$$\det (A - \lambda I) = 0 \tag{29}$$

[where $A = ((f''_{x_i x_j})_0)_{nn}$] in place of (27). If all the roots of equation
(29) are positive then form (28) assumes only positive values at all
points $(h_1, h_2, \dots, h_n)$ different from the point $h_1 = h_2 = \dots = h_n = 0$.
A quadratic form of this type is said to be positive definite.
In this case the function $f$ has a minimum at the stationary point
in question. If all the roots of equation (29) are negative then form
(28) is negative definite, and the function $f$ has a maximum. But
if equation (29) possesses roots of both signs then the function $f$
has a minimax.*

* It can be shown that an equation of form (29) with the matrix
$A = ((f''_{x_i x_j})_0)$ has only real roots because the matrix is symmetric. See also
Sec. XI.10 on this question. (Tr.)
8. The Method of Least Squares. As an example illustrating
applications of the theory of extremum of functions of several arguments
we shall consider the least-square method which can be applied to
constructing empirical formulas. In particular, the method is used
when the accuracy of the approximate method described in Sec. I.30
is insufficient. It is also applied to automatizing calculations. Here
we shall dwell only on the problem of selecting a linear functional
relationship in the case of one independent variable. In performing
these calculations we usually apply the following way of reasoning
which can also be applied to other functional relationships: the
sought-for function is of the form $y = kx + b$ but the values of
the parameters $k$ and $b$ are yet unknown. The substitution of $x = x_i$
into the formula should have resulted in $kx_i + b$ but the
experimental data give the value $y_i$, and thus we have the difference
$y_i - kx_i - b$ between the theoretical and experimental data which
is due to the errors of the experiment and of the calculations, the
non-linearity of the relationship under consideration, etc. This
difference between the left-hand side and the right-hand side of a
formula is called a discrepancy.

Therefore, let us try to select k and b in such a way that the sum 
of the squares of these discrepancies, that is the quantity 

$$S = \sum_{i=1}^{N} (y_i - kx_i - b)^2$$


should take on its minimal value among all the possible values. 
We can also take a sum of other even powers or, for instance, the 
sum of the absolute values of the discrepancies, but this will involve 
more complicated calculations. At the same time we must not take 
the sum of the discrepancies themselves because it can be small 
when the absolute values of the summands (which can be of 
different signs) are large. Thus we arrive at the problem of finding 
a minimum of the function $S = S(k, b)$. Applying necessary
conditions (19) we see that for the function to have a minimum, it is
necessary that the equalities 

$$S'_k = -2 \sum_{i=1}^{N} (y_i - kx_i - b)\,x_i = 0, \quad S'_b = -2 \sum_{i=1}^{N} (y_i - kx_i - b) = 0$$

should be fulfilled. It follows that

$$k \sum_{i=1}^{N} x_i^2 + b \sum_{i=1}^{N} x_i = \sum_{i=1}^{N} x_i y_i, \quad k \sum_{i=1}^{N} x_i + bN = \sum_{i=1}^{N} y_i$$

Hence, we have obtained a simple system of two algebraic equa- 
tions of the first degree in two unknowns from which k and b can 
be found. All $x_i$ and $y_i$ being known, the system can easily be solved.
The fact that in this way we really obtain a minimum of S is implied 
by the meaning of the problem in question. 
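The two linear equations for $k$ and $b$ can be solved in closed form. The following is a minimal sketch that uses Cramer's rule; the function name `fit_line` and the sample data are assumptions made for the illustration.

```python
def fit_line(xs, ys):
    """Return (k, b) minimising the sum of squared discrepancies y_i - k*x_i - b."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # The system:  k*sxx + b*sx = sxy,   k*sx + b*n = sy
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("all x values coincide, so the line is not determined")
    k = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return k, b

if __name__ == "__main__":
    # Hypothetical measurements lying exactly on y = 2x + 1
    print(fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]))
```

For data lying exactly on a straight line all the discrepancies vanish and the procedure recovers the coefficients of that line.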
Similar methods are applied to selecting other empirical formulas
and to some other problems.

Let us take an example. Suppose that an experiment carried out
in order to establish a relationship between some quantities $x$ and
$y$ indicates that these relationships are approximately expressed
by the equations

$$
\left.
\begin{aligned}
x + y &= 5.8 \\
x + 2y &= 8.1 \\
2x + 3y &= 13.2
\end{aligned}
\right\} \tag{30}
$$

The system is formally inconsistent because adding together the 
first two equations we arrive at a contradiction with the third equa- 
tion. But this can be a result of the errors of the experiment! There- 
fore let us try to satisfy system (30) as precisely as possible so that the 
sum of the squares of the discrepancies should be as small as possible. 
Thus, we are going to find the values of x and y such that the quantity 

$$S = (x + y - 5.8)^2 + (x + 2y - 8.1)^2 + (2x + 3y - 13.2)^2$$

assumes its minimal value. Applying necessary conditions (19) 
we obtain 


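$$
\begin{aligned}
S'_x &= 2(x + y - 5.8) + 2(x + 2y - 8.1) + 2(2x + 3y - 13.2) \cdot 2 = 0, \\
S'_y &= 2(x + y - 5.8) + 2(x + 2y - 8.1) \cdot 2 + 2(2x + 3y - 13.2) \cdot 3 = 0
\end{aligned}
$$

Cancelling out the factor 2 we get

$$
\left.
\begin{aligned}
6x + 9y &= 5.8 + 8.1 + 2 \times 13.2 = 40.3 \\
9x + 14y &= 5.8 + 2 \times 8.1 + 3 \times 13.2 = 61.6
\end{aligned}
\right\}
$$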

Solving the equations in the simplest way we find $x \approx 3.3$ and
$y = 2.3$ (to one decimal place). Of course, these values only approximately satisfy system
(30). Naturally, the greater the number of relationships of form (30) 
between x and y, the more reliable the values of x and y thus obtai- 
ned. Of course, this is so if there are no systematic errors in the 
experiment (random errors occurring in some relationships mutually 
cancel). Many other systems of approximate equations can be solved 
in like manner. In particular, the method can be applied to systems 
of empirical equations when the number of equations exceeds the 
number of unknowns. 
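The computation for system (30) can be reproduced mechanically. The sketch below forms the same two equations from the coefficients of the system and solves them by Cramer's rule; the function name `least_squares_2` and the data layout are assumptions made for the illustration.

```python
def least_squares_2(rows, rhs):
    """Least-squares solution of equations a*x + b*y = c with two unknowns.

    rows is a list of coefficient pairs (a, b); rhs is the list of right-hand sides c.
    """
    a11 = sum(a * a for a, _ in rows)
    a12 = sum(a * b for a, b in rows)
    a22 = sum(b * b for _, b in rows)
    r1 = sum(a * c for (a, _), c in zip(rows, rhs))
    r2 = sum(b * c for (_, b), c in zip(rows, rhs))
    # The system:  a11*x + a12*y = r1,   a12*x + a22*y = r2
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("the two equations do not determine x and y uniquely")
    x = (r1 * a22 - a12 * r2) / det
    y = (a11 * r2 - a12 * r1) / det
    return x, y

if __name__ == "__main__":
    # Coefficients and right-hand sides of system (30)
    x, y = least_squares_2([(1, 1), (1, 2), (2, 3)], [5.8, 8.1, 13.2])
    print(round(x, 1), round(y, 1))
```

For system (30) the sums computed inside the function are 6, 9, 14, 40.3 and 61.6, that is exactly the coefficients obtained above after cancelling out the factor 2.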

The method of least squares was discovered by the French mathe- 
matician A. M. Legendre (1752-1833) and by Gauss. It has many 
useful applications at the present time.

9. Curvature of Surfaces. The classification of stationary points 
described in Sec. 7 is directly related to the classification of surfaces. 

Fig. 238

Let us consider an arbitrary surface $(S)$ and take a point $M$ on it
(see Fig. 238). If we draw the normal $nn$ to the surface at the point
and then draw an arbitrary plane $(P)$ passing through the normal
the plane will intersect $(S)$ along a plane curve $ll$ which is called
a normal section of $(S)$ at $M$. The curve $ll$ has a certain curvature $k$
at the point $M$ (see Sec. VII.24). Now if we rotate the plane $(P)$
about the normal $nn$ the normal section will vary, and its curvature
$k$ will therefore also vary, in the general case. To investigate the
law of this variation let us choose a Cartesian coordinate system so
that the origin of the coordinates should be at the point $M$ and the
$z$-axis should go along the normal $nn$. Then in the vicinity of $M$ the
surface $(S)$ can be represented by an equation of the form
$z = z(x, y)$, and the point $M$ [at which $z(0, 0) = 0$] will be a
stationary (critical) point of the function $z(x, y)$ (why will it be so?).
By arguments similar to the ones applied at the end of Sec. 7
we conclude that after a certain rotation of the coordinate axes about
the axis $nn$ has been performed the equation of $(S)$ turns into the form

$$z = \tfrac{1}{2}\left(\lambda_1 x'^2 + \lambda_2 y'^2\right) + \text{the terms of higher order of smallness} \tag{31}$$

where $x'$ and $y'$ are the new coordinates replacing $x$ and $y$. Let the
plane $(P)$ form an angle $\varphi$ with the plane $x'Mz$. Then passing to
polar coordinates we obtain $x' = \rho \cos\varphi$ and $y' = \rho \sin\varphi$ which
yield

$$z = \tfrac{1}{2}\left(\lambda_1 \cos^2\varphi + \lambda_2 \sin^2\varphi\right)\rho^2 + \dots$$

Hence, at the point $M$ we have $\frac{dz}{d\rho} = 0$ and
$\frac{d^2z}{d\rho^2} = \lambda_1 \cos^2\varphi + \lambda_2 \sin^2\varphi$.
Formula (VII.37) thus implies the expression
$k = |\lambda_1 \cos^2\varphi + \lambda_2 \sin^2\varphi|$ for the curvature. Accordingly, there
can be three cases here, namely the following cases:

1. Let $\lambda_1 \lambda_2 > 0$, i.e. let $\lambda_1$ and $\lambda_2$ be of the same sign. Then all
the normal sections have the same direction of convexity near the
point $M$, and the values of $k$ lie within the limits $|\lambda_1|$ and $|\lambda_2|$.
Besides, we have $k = |\lambda_1|$ for $\varphi = 0$, that is for the plane $x'Mz$,
and $k = |\lambda_2|$ for $\varphi = \frac{\pi}{2}$, that is for the plane $y'Mz$ (these are
the so-called principal normal sections). A point $M$ of this type is
called an elliptic (or umbilical in case $|\lambda_1| = |\lambda_2|$) point of the
surface $(S)$. For instance, all the points of an ellipsoid or of a
hyperboloid of two sheets are elliptic. Equation (31) implies that the
tangent plane to $(S)$ at the point $M$ has only one common point $M$
with the surface $(S)$ near the point $M$. The planes parallel to the
tangent plane which are drawn sufficiently close to it and which
intersect $(S)$ yield the intersection lines of the form of an
infinitesimal ellipse whose axes lie in the planes of principal normal sections.
The form of the sections can become more complicated when the
distance from the plane (parallel to the tangent plane) to the point
$M$ is increased (see Fig. 239a).

2. Let $\lambda_1 \lambda_2 < 0$. Then some of the normal sections have a
positive curvature at the point $M$ and the direction of convexity
coinciding with the direction of the outer normal to the surface near
the point $M$, and some other sections have a negative curvature
and the opposite direction of convexity. For instance, this is the
case for the points of a hyperboloid of one sheet. One of the sections
belonging to the first group has the maximal curvature $|\lambda_1|$ (or $|\lambda_2|$)
and one of the sections of the second group has the maximal
curvature $|\lambda_2|$ (or, respectively, $|\lambda_1|$). These principal normal sections
are also mutually perpendicular. A point $M$ of this type is called
a hyperbolic point of the surface. The tangent plane to $(S)$ at the
point $M$ intersects the surface $(S)$ along two curves which form
a nonzero angle (i.e. the angle between the tangent lines to the curves)
at $M$. The planes parallel to the tangent plane which are drawn
infinitely close to $M$ intersect $(S)$ along hyperbolas lying infinitely
close to $M$, the axes of the hyperbolas being directed along the
principal normal sections (see Fig. 239b).

Fig. 239 (panels a, b and c; the marked curves are principal normal sections)

3. Let $\lambda_1 \lambda_2 = 0$. Then, if $\lambda_1$ and $\lambda_2$ are not simultaneously equal
to zero all the normal sections have the same direction of convexity
near $M$ and have a nonzero curvature at $M$ except for one of the
sections which has the zero curvature at $M$. The curvature of the
section which is perpendicular to the section of zero curvature is
the maximal at the point $M$. A point of this type is called a
parabolic point. For example, all the points of a cylindrical or conical
surface are of this type. A typical disposition of lines of
intersections of a surface $(S)$ with the planes parallel to the tangent plane
drawn at a parabolic point $M$ of the surface is shown in Fig. 239c
but there can be some other dispositions in different cases. The case
when $\lambda_1 = \lambda_2 = 0$, that is when $k = 0$ for all $\varphi$, also belongs to
this type. In such a case a point of this type is called a planar point
of the surface $(S)$. It is clear that all the points of a plane are of
this type.

In the above concrete examples all the points of each of the sur- 
faces were of the same type but this must not be necessarily so in 
the general case. For instance, the surface of a torus has points 
belonging to each of the three types (where are these points placed 
on the surface of a torus?). 

In all cases the product $\lambda_1 \lambda_2$ is called the total (Gaussian)
curvature of the surface $(S)$ at the point $M$. There is a remarkable
property of the total curvature: when a surface is bent without
stretching its total curvature does not change. For instance, if we take
a sheet of paper and bend it in an arbitrary way the surface thus 
obtained will have the zero total curvature at each of its points. The 
same cause makes it impossible to flatten a portion of a sphere with- 
out deformation, and there cannot therefore be a geographic map 
without distortion. 
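The three cases of this section and the expression for the curvature of a normal section can be summarised in a few lines of code. This is only a sketch: the function names `normal_curvature` and `classify_point` are assumptions, the arguments are the coefficients of the quadratic terms in equation (31), and the planar case (which the text treats as a particular instance of case 3) is reported separately for convenience.

```python
import math

def normal_curvature(lam1, lam2, phi):
    # Curvature of the normal section whose plane forms the angle phi with the plane x'Mz
    return abs(lam1 * math.cos(phi) ** 2 + lam2 * math.sin(phi) ** 2)

def classify_point(lam1, lam2):
    # Type of the point M according to the sign of the total (Gaussian) curvature
    total = lam1 * lam2
    if total > 0:
        return "elliptic"
    if total < 0:
        return "hyperbolic"
    if lam1 == 0 and lam2 == 0:
        return "planar"
    return "parabolic"

if __name__ == "__main__":
    print(classify_point(1.0, 2.0), normal_curvature(1.0, 2.0, 0.0))
    print(classify_point(1.0, -2.0), normal_curvature(1.0, -2.0, math.pi / 2))
    print(classify_point(0.0, 3.0), normal_curvature(0.0, 3.0, 0.0))
```

For a hyperbolic point the expression under the absolute value sign changes its sign as the angle varies, which is why some normal sections are convex in one direction and some in the other.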

A surface can have singular points (usually these are isolated 
points or “conical” points similar to the vertex of a circular cone 
but of course there are also singular points of more complicated 
types). It can also have singular lines which are loci of singular 
points (most often these lines are isolated lines or lines of self-inter- 
section; there are also “cuspidal edges” and some other types of sin- 
gular lines). 

10. Conditional Extremum. In the problems considered in Sec. 7 
we investigated extrema in the case when independent variables 
were not connected by any additional relationships. An extremum 
of this kind is called an unconditional extremum. But there are 
also problems concerning a so-called conditional extremum when 
arguments are related to one another by relationships of the form 
of an equality. We begin our investigation with functions of two 
independent variables. 

Suppose we seek for a maximum or a minimum of a function
$z = f(x, y)$ on the condition that $x$ and $y$ are restricted by the
relationship

$$F(x, y) = h \tag{32}$$

Equation (32) is called a coupling equation [conditions of form (32)
are also called subsidiary conditions, side conditions or constraints].
Thus, we consider and compare only those values of the function $f$
that correspond to the points (lying in the $x, y$-plane) which belong
to the curve represented by equation (32). For instance, in Fig. 240
we see level lines of a function $f$. At the point $K$ the function attains
its maximum (unconditional). At the same time there are three
conditional extrema here, namely two maxima at the points $A$
and $C$ and one minimum at the point $B$ (think why it is so). An
unconditional maximum can be compared to the top of a mountain.
Then it is natural to compare a conditional maximum to the highest
point of a mountain path whose projection on the $x, y$-plane has
an equation of form (32).

If it is possible to express $y$ in terms of $x$ with the aid of equation
(32) then we can substitute the result $y = y(x)$ into the expression
of $z$ and thus obtain $z$ as a function of a single independent variable:

$$z = f[x, y(x)] \tag{33}$$

When substituting $y = y(x)$ we have taken into account condition
(32), and, since there are no other restrictions, the problem reduces
to finding an unconditional extremum of $z = f[x, y(x)]$. There
will be a similar situation if it is possible to solve equation (32)
for $x$ or if the curve defined by equation (32) can be represented by
parametric equations.

Fig. 240

It should be noted that the above resolution of equation (32) 
may not be possible in some cases and, besides, this can be incon- 
venient and can lead to complicated expressions even when equation 
(32) is solvable. In such a case we can reason in the following way. 
Coupling equation (32) defines some relationship y — y (x) which 
may not be known in an explicit form. Consequently, $z$ is a composite
function of the independent variable $x$ of form (33). The necessary
condition for an extremum can therefore be put down in the form

$$\frac{dz}{dx} = f'_x + f'_y \frac{dy}{dx} = 0 \tag{34}$$

on the basis of the formula for the derivative of a composite function.
Here the expression $\frac{dy}{dx}$ designates the derivative of the implicit
function $y(x)$ defined by condition (32). Hence, by Sec. IX.13, we
have

$$F'_x + F'_y \frac{dy}{dx} = 0, \quad\text{i.e.}\quad \frac{dy}{dx} = -\frac{F'_x}{F'_y}$$

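Substituting this expression into (34) we see that at the point of
a conditional extremum we have

$$f'_x - \frac{F'_x}{F'_y} f'_y = 0, \quad\text{i.e.}\quad \frac{f'_x}{F'_x} = \frac{f'_y}{F'_y}$$

Let us denote the value of the last ratio by $\lambda$. Then we have

$$\frac{f'_x}{F'_x} = \frac{f'_y}{F'_y} = \lambda \tag{35}$$

at the point of a conditional extremum. This can be rewritten as

$$f'_x - \lambda F'_x = 0, \quad f'_y - \lambda F'_y = 0 \tag{36}$$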

Let us introduce the notation 

$$f^*(x, y, \lambda) = f(x, y) - \lambda F(x, y) \tag{37}$$

where $\lambda$ is an undetermined parameter which is called Lagrange’s
undetermined multiplier (factor) named after Lagrange who
introduced this method. Then equations (36) can be put down in the form

$$f^{*\prime}_x = 0, \quad f^{*\prime}_y = 0 \tag{38}$$

Thus, we have arrived at equations of the same form [see equations 
(19)] but for the modified function /* defined by formula (37) instead 
of the original function /. Equations (38) together with the coupling 
equation (32) form a system of three equations in three unknowns 
which are x, y and X. The points in the x, y-plane at which a con- 
ditional extremum may be attained are found from these equations. 
The conditions thus obtained are only necessary. Sufficient
conditions guaranteeing the existence of a conditional extremum at
a point defined by equations (38) and (32) can similarly be deduced
from the sufficient conditions for an unconditional extremum
established above but we shall not do this here. [By the way, in our case
it is sufficient to compute the derivative $\frac{d^2z}{dx^2}$ and to investigate its
sign at the point defined by (32) and (38).]

Lagrange’s multiplier $\lambda$ has a simple meaning. To illustrate it
denote the coordinates of the point of a conditional extremum by
$\bar{x}$ and $\bar{y}$. Let $\bar{z}$ designate the corresponding extremal value of $z$.
Up till now the quantity $h$ entering into equation (32) has been
considered to be fixed. But now we can vary $h$ and then all the
three quantities $\bar{x}$, $\bar{y}$ and $\bar{z}$ will also vary, i.e. $\bar{x}$, $\bar{y}$ and $\bar{z}$ will become
functions of $h$. The identity $\bar{z}(h) = f[\bar{x}(h), \bar{y}(h)]$ holding, we have

$$\frac{d\bar{z}}{dh} = f'_x \frac{d\bar{x}}{dh} + f'_y \frac{d\bar{y}}{dh} \tag{39}$$

On the other hand, by (32), we have

$$F'_x \frac{d\bar{x}}{dh} + F'_y \frac{d\bar{y}}{dh} = 1 \tag{40}$$

From (39), (35) and (40) we readily deduce $\frac{d\bar{z}}{dh} = \lambda$. Hence, the
factor $\lambda$ is equal to the rate of change of the extremal value $\bar{z}$ when
the parameter $h$ entering into coupling equation (32) varies.
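The meaning of the multiplier can be verified on a simple example. The sketch below is an illustration under stated assumptions: the function $f(x, y) = xy$ and the coupling equation $x + y = h$ are chosen for the example and are not taken from the text. For them the stationarity conditions of the modified function are $y - \lambda = 0$ and $x - \lambda = 0$, which together with the coupling equation give $x = y = \lambda = h/2$.

```python
def constrained_extremum(h):
    """Conditional extremum of f(x, y) = x*y on the line x + y = h.

    Returns (x, y, lam, z), where lam is Lagrange's multiplier and z = f(x, y).
    """
    lam = h / 2.0    # from y - lam = 0, x - lam = 0 and x + y = h
    x = lam
    y = lam
    return x, y, lam, x * y

def rate_of_change(h, dh=1e-6):
    # Central-difference estimate of the derivative of the extremal value with respect to h
    z_plus = constrained_extremum(h + dh)[3]
    z_minus = constrained_extremum(h - dh)[3]
    return (z_plus - z_minus) / (2.0 * dh)

if __name__ == "__main__":
    h = 3.0
    lam = constrained_extremum(h)[2]
    print(lam, rate_of_change(h))   # the two numbers should nearly coincide
```

Here the extremal value equals $h^2/4$, so its derivative with respect to $h$ is $h/2$, which is exactly the value of the multiplier.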

The investigation of a conditional extremum in the general case 
of an arbitrary number of independent variables and any number 
of coupling equations is carried out in like manner. [We remind
the reader that according to Sec. X.2 the number of coupling
equations (subsidiary conditions) must be less than the number of
arguments.]

For instance, if we are looking for an extremum of a function
$f(x, y, z, u, v)$ when the arguments $x$, $y$, $z$, $u$ and $v$ are restricted
by the conditions

$$F_1(x, y, z, u, v) = 0, \quad F_2(x, y, z, u, v) = 0 \quad\text{and}\quad F_3(x, y, z, u, v) = 0 \tag{41}$$

we must perform calculations as if we wanted to find an
unconditional extremum of the function

$$f^* = f - \lambda_1 F_1 - \lambda_2 F_2 - \lambda_3 F_3$$

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are undetermined Lagrange’s multipliers. The
stationarity condition for $f^*$ yields the equations

$$f^{*\prime}_x = 0, \quad f^{*\prime}_y = 0, \quad f^{*\prime}_z = 0, \quad f^{*\prime}_u = 0 \quad\text{and}\quad f^{*\prime}_v = 0$$

which, together with equations (41), form a system of $8 = 5 + 3$
equations in $8 = 5 + 3$ unknowns $x$, $y$, $z$, $u$, $v$, $\lambda_1$, $\lambda_2$ and $\lambda_3$.

Methods of solving a problem of a conditional extremum form
the basis of one of the widely spread numerical methods of finding
an unconditional extremum. This is the so-called method of steepest
descent. Here we shall describe a variant of the method applicable
to the case of a minimum of a function of two arguments although
in the general case different modifications of the method can be
applied to finding minima (or maxima) of functions of any number
of independent variables. Let it be necessary to find a point of
minimum of a function $f(x, y)$. It turns out that when the function
$f$ does not belong to a certain class of the simplest functions it is
rather difficult (and even impossible) to solve equations (19).
Besides, in solving equations (19) we find all the stationary points
including those which are not points of minimum and hence we do
much unnecessary work. It is therefore better to apply the method
of steepest descent which is an iterative method. We begin with
taking a point $M_0(x_0, y_0)$ as a zeroth approximation. At this point
the function $f$ has the maximal rate of decrease in the direction of
the vector $-(\operatorname{grad} f)_{M_0} = -f'_x(x_0, y_0)\,\mathbf{i} - f'_y(x_0, y_0)\,\mathbf{j}$ (because, as
was shown in Sec. 1, the direction of the vector $\operatorname{grad} f$ indicates
the direction of the maximal rate of increase of the function). Let
us draw a ray through the point $M_0$ in this direction and consider
the values of the function $f$ which are taken on the ray. These values
are expressed by the quantity
$f(x_0 - f'_x(x_0, y_0)\,t,\ y_0 - f'_y(x_0, y_0)\,t)$ which we
regard as a function of $t$ for $t > 0$. After that we find a value of $t$
for which this function of one variable attains its minimum. This
value of $t$ determines a new point $M_1(x_1, y_1)$. Then we draw a ray
through the point $M_1$ in the direction of the vector $-(\operatorname{grad} f)_{M_1}$
[$(\operatorname{grad} f)_{M_1}$ designates the gradient at the point $M_1$] and find a point
$M_2$ at which $f$ attains its minimum on the ray etc. In many cases
this method enables us to find an approximate position of the
sought-for point of extremum within a sufficient accuracy after several
steps of this kind have been carried out. (We suggest that the reader
should consider level lines of a function $f$ on the $x, y$-plane and
find out the geometric meaning of the method.) Methods of this
type which enable us to find extrema without using necessary
conditions are called direct methods.
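The iteration described above can be sketched in code. The text does not specify how the minimum along each ray is to be found; the sketch below uses a golden-section search on a bounded interval, which is an assumption of the illustration and presupposes that the function has a single minimum on that part of the ray. The names `golden_section_min`, `steepest_descent` and the parameter `t_max` are likewise hypothetical.

```python
import math

def golden_section_min(g, lo, hi, tol=1e-10):
    # Minimise a function g of one variable on [lo, hi], assuming a single minimum there
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if g(c) < g(d):
            hi = d
        else:
            lo = c
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

def steepest_descent(f, grad, x0, y0, steps=50, t_max=1.0):
    """Approximate a point of minimum of f(x, y), starting from M0 = (x0, y0).

    grad(x, y) must return the pair (f'_x, f'_y); t_max bounds the search along each ray.
    """
    x, y = x0, y0
    for _ in range(steps):
        gx, gy = grad(x, y)
        if math.hypot(gx, gy) < 1e-12:
            break  # the gradient vanishes: a stationary point has been reached
        # Values of f on the ray drawn from (x, y) in the direction of -grad f
        t = golden_section_min(lambda s: f(x - gx * s, y - gy * s), 0.0, t_max)
        x, y = x - gx * t, y - gy * t
    return x, y

if __name__ == "__main__":
    # Hypothetical test function with its minimum at the point (1, -2)
    f = lambda x, y: (x - 1.0) ** 2 + 2.0 * (y + 2.0) ** 2
    grad = lambda x, y: (2.0 * (x - 1.0), 4.0 * (y + 2.0))
    print(steepest_descent(f, grad, 0.0, 0.0))
```

Each new ray is perpendicular to the previous one when the one-dimensional minimum is found exactly, so the successive points approach the minimum along a zigzag path.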

11. Extremum with Unilateral Constraints. Independent variables
involved in an extremal problem can be restricted by one or several
conditions of the form of an inequality. Such conditions are called
unilateral (one-sided or non-restricting) constraints. For example,
let an extremum of a function $f(x, y)$ be sought for and let the
independent variables be related to one another by the constraint
$F(x, y) \geq 0$ which defines a domain $(S)$ with a boundary $(L)$
in the $x, y$-plane (see Fig. 241). The curve $(L)$ is defined by the
equation $F = 0$. The function $f$ can have both extrema attained
in the interior of $(S)$ and extrema attained on $(L)$. In order to find
the former we can use conditions (19) defining stationary points
but these conditions do not apply to extrema on the boundary $(L)$.
To find the latter extrema we remark that if the function $f$ has an
extremum at a point $M$ belonging to $(L)$, for instance, a minimum,
then the value $f(M)$ is smaller than all the values of $f$ taken on $(L)$
near $M$. Therefore there will simultaneously be a conditional
minimum of $f$ at $M$ for the coupling equation $F = 0$. Hence, such points
can be found with the help of the methods described in Sec. 10.
These results can be applied to the problem of finding the greatest
and the least values of a function. For instance, let a function
$z = f(x, y)$ be considered in the domain depicted in Fig. 242. Suppose
that neither the function nor its derivatives have discontinuities
in the domain. If the point at which the function attains its greatest
value lies in the interior of the domain then there will be an
unconditional maximum of the function at this point. If the point belongs
to the contour bounding the domain but does not lie at the vertices
$A$, $B$ and $C$ then there will be a conditional maximum of the
function at this point, and the equation of the corresponding arc of the
contour will serve as condition (32). Finally, if the greatest value
is attained at a vertex ($A$, $B$ or $C$) then in order to find this greatest
value we must additionally compare the values of the function at
the points $A$, $B$ and $C$ with its other extremal values.

Thus, to find the greatest value we must find all the points of 
unconditional maximum lying in the interior of the domain and 
all the points of conditional maximum belonging to the contour. 
Moreover, in case it is difficult to specify beforehand which of the 
possible points of conditional extremum will be the points of maxi- 
mum it is advisable to find the values of the function at all these 
points. After that we compare with each other the extremal (maxi- 
mal) values taken in the interior of the domain, the extremal values 
attained on the contour and the values of the function at the points 
A, B and C and thus find the sought-for greatest value of the func- 
tion. The least value of a function is found in like manner. As in 
Sec. IV.19, it is better to seek the greatest and the least values of 
a function simultaneously. 
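
The comparison procedure just described can be sketched in a few lines of code. The following is only an illustration and is not taken from the text: the function $f(x, y) = x^2 + y^2 - x - y$, the triangular domain with vertices $A(0, 0)$, $B(2, 0)$, $C(0, 2)$ and all helper names are assumptions chosen for the example. Since the sides are straight and $f$ is quadratic, the conditional extremum on each side is found from three samples of $f$ along that side.

```python
# Illustrative sketch: f, the triangle and every helper name are assumptions.

def f(x, y):
    return x * x + y * y - x - y

# Vertices A, B, C of the (hypothetical) triangular domain.
VERTICES = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]

def edge_candidate(p, q):
    """Point of conditional extremum of f on the open segment pq, or None.

    Along the segment f equals a*t**2 + b*t + c in the parameter t,
    so the three samples g(0), g(1/2), g(1) determine a and b exactly.
    """
    def g(t):
        return f(p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

    g0, gh, g1 = g(0.0), g(0.5), g(1.0)
    a = 2.0 * (g0 - 2.0 * gh + g1)
    b = g1 - g0 - a
    if a == 0.0:
        return None
    t = -b / (2.0 * a)
    if 0.0 < t < 1.0:
        return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))
    return None

def candidates():
    # Interior stationary point, found by hand from f_x = 2x - 1 = 0 and
    # f_y = 2y - 1 = 0; it lies inside the triangle because 0.5 + 0.5 < 2.
    points = [(0.5, 0.5)]
    for i in range(3):
        c = edge_candidate(VERTICES[i], VERTICES[(i + 1) % 3])
        if c is not None:
            points.append(c)
    points.extend(VERTICES)
    return points

values = [(f(x, y), (x, y)) for (x, y) in candidates()]
least = min(values)
greatest = max(values)
print("least value and point:", least)
print("greatest value and point:", greatest)
```

In this example the least value is attained at the interior stationary point and the greatest value at a vertex, which shows why the values at the vertices must be included in the comparison.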

If the domain under consideration contains points of disconti- 
nuity of the partial derivatives of the first order the values of the 
function at these points should be included into the set of values 
which we compare with each other because it can turn out that the 
greatest (or the least) value of the function is attained at some of these 
points. If there are points of discontinuity of the function then we
should additionally investigate the behaviour of the function in
approaching such points. If there are lines of discontinuity of the
function or of its derivatives then the values of the function taken 
on such lines must also be investigated which leads to the problem 
of a conditional extremum. Finally, if the domain in which we 
investigate the function extends to infinity we must additionally 
investigate the behaviour of the function when the variable point 
in the x, y-plane approaches infinity. 

Functions of a greater number of independent variables are
investigated in a similar way. But in performing such an investigation
we must take into account some additional factors which are due to the
increase of the dimension. For instance, if we consider a function
$u = f(x, y, z)$ in the domain shown in Fig. 243 then we must consider its
unconditional maxima in the interior of the “curvilinear tetrahedron”, its
conditional maxima with one coupling equation on the “faces” of the
tetrahedron (in this case the equations of the corresponding surfaces serve as
coupling equations), the conditional maxima with two coupling equations
on the “edges” of the tetrahedron
[the role of the coupling equations can be played by the
equations of the corresponding space curve if they are put down in
form (X.2)] and, finally, the values of the function at the
“vertices”. Comparing all these values with each other we find the greatest
value of the function. Similarly, if there are surfaces of
discontinuity of the function we come to the problem of a conditional
maximum with one coupling equation, and if there are lines of
discontinuity we come to the problem of a conditional extremum with
two coupling equations, etc.

If we apply an iterative scheme of the type of the method of 
steepest descent described in Sec. 10 then in the case when there 
are several minima (maxima) in the domain in question we can 
come to a minimum (maximum) which is not the least (greatest). 
In such a case it is advisable to apply the method several times 
beginning with different zeroth approximations chosen at random. 
For instance, after such repeated calculations have been carried 
out we can obtain a smaller minimum than the one found in the 
first calculation, and in many problems this may finally lead to the 
least value of the function. 
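
The advice to restart from several zeroth approximations can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of Sec. 10 itself: a fixed-step gradient descent stands in for the steepest descent with minimization along the ray, and the test function $f(x, y) = (x^2 - 1)^2 + 0.3x + y^2$ (which has two minima of different depth), the step size and all names are chosen only for the example.

```python
import random

# Illustrative function with two minima; it is an assumption of this sketch.
def f(x, y):
    return (x * x - 1.0) ** 2 + 0.3 * x + y * y

def grad_f(x, y):
    return 4.0 * x * (x * x - 1.0) + 0.3, 2.0 * y

def descend(x, y, step=0.02, iterations=5000):
    """Fixed-step descent along minus the gradient from the start (x, y)."""
    for _ in range(iterations):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

def multistart(starts):
    """Run the descent from every start and keep the smallest minimum found."""
    results = [descend(x0, y0) for (x0, y0) in starts]
    return min(results, key=lambda p: f(*p))

# Zeroth approximations chosen at random in the square [-1.5, 1.5] x [-1.5, 1.5].
rng = random.Random(0)
random_starts = [(rng.uniform(-1.5, 1.5), rng.uniform(-1.5, 1.5)) for _ in range(10)]
best = multistart(random_starts)
print("best point:", best, "value:", f(*best))
```

A single run started at $(1.5, 1)$ stops at the shallower minimum near $x \approx 0.96$, whereas a run started at $(-1.5, 0.5)$ reaches the deeper one near $x \approx -1.04$; comparing the results of several runs is what singles out the smaller value.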

12. Numerical Solution of Systems of Equations. In conclusion
we shall consider some methods of numerical solution of a system
of two equations in two unknowns. The case of a system of $n$
equations in $n$ unknowns can be treated similarly.

The iterative method has the same form as in Secs. V.3 and VI.5.
To apply the method we rewrite the system in question in the form

$$x = f(x, y), \qquad y = g(x, y) \tag{42}$$

Then we choose a zeroth approximation $x = x_0$, $y = y_0$. The
subsequent approximations are constructed according to the formulas

$$x_1 = f(x_0, y_0), \quad y_1 = g(x_0, y_0); \qquad x_2 = f(x_1, y_1), \quad y_2 = g(x_1, y_1); \qquad \text{etc.}$$

If the process is convergent then in the limit we obtain a solution
of system (42). The smaller the rate of change of the functions
$f$ and $g$ when their arguments vary (that is, the smaller the absolute
values of the partial derivatives of the functions), the better the
convergence of the process.

A modification of the method (the Seidel method) which is based
on using some numerical values obtained at each step of the
calculations for computing other values at the same step may sometimes
accelerate the convergence. Such calculations are performed according
to the following scheme: $x_1 = f(x_0, y_0)$, $y_1 = g(x_1, y_0)$,
$x_2 = f(x_1, y_1)$, $y_2 = g(x_2, y_1)$ and so on.
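
Both schemes are easy to try out numerically. The sketch below is an illustration only: the system $x = 0.5 \cos y$, $y = 0.5 \sin x$, the stopping rule and the function names are assumptions made for the example. The partial derivatives of the right-hand sides do not exceed $0.5$ in absolute value, so both processes converge.

```python
import math

# Hypothetical right-hand sides of a system written in the form
# x = f(x, y), y = g(x, y).
def f(x, y):
    return 0.5 * math.cos(y)

def g(x, y):
    return 0.5 * math.sin(x)

def simple_iteration(x, y, tol=1e-12, max_steps=1000):
    """Both new values are computed from the previous approximation."""
    for step in range(1, max_steps + 1):
        x_new, y_new = f(x, y), g(x, y)
        done = abs(x_new - x) + abs(y_new - y) < tol
        x, y = x_new, y_new
        if done:
            return x, y, step
    raise RuntimeError("the process did not converge")

def seidel(x, y, tol=1e-12, max_steps=1000):
    """The freshly computed x is used at once when computing y."""
    for step in range(1, max_steps + 1):
        x_new = f(x, y)
        y_new = g(x_new, y)
        done = abs(x_new - x) + abs(y_new - y) < tol
        x, y = x_new, y_new
        if done:
            return x, y, step
    raise RuntimeError("the process did not converge")

xs, ys, steps_simple = simple_iteration(0.0, 0.0)
xz, yz, steps_seidel = seidel(0.0, 0.0)
print("simple iteration:", xs, ys, "steps:", steps_simple)
print("Seidel method:   ", xz, yz, "steps:", steps_seidel)
```

Printing the two step counts makes it possible to compare the speed of convergence of the two schemes for a particular system.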

Newton’s method (see Sec. V.2) is based on the replacement of
given functions by their linear approximations constructed with
the help of the values of the functions and their derivatives for
the values of the arguments approximately equal to the sought-for
solutions. Suppose we have to solve a system of equations of
the form

$$P(x, y) = 0, \qquad Q(x, y) = 0 \tag{43}$$

Let us begin with a zeroth approximation $x = x_0$, $y = y_0$ to the
sought-for solution which can be found by means of an approximate
sketch of curves (43) in the $x, y$-plane or which can be implied by
the physical meaning of the problem and the like. Taking expansions
(17) of the functions $P$ and $Q$ into powers of $h = x - x_0$ and
$k = y - y_0$ and dropping the terms of higher order of smallness we
arrive at the following system of equations:

$$P(x_0, y_0) + P'_x(x_0, y_0)(x - x_0) + P'_y(x_0, y_0)(y - y_0) = 0,$$
$$Q(x_0, y_0) + Q'_x(x_0, y_0)(x - x_0) + Q'_y(x_0, y_0)(y - y_0) = 0 \tag{44}$$

System (44) approximately replaces system (43). Solving (44), which
is a system of linear equations, we obtain the values of the first
approximation $x = x_1$ and $y = y_1$. The second approximation is
found from system (44) after $x_1$ and $y_1$ are substituted for $x_0$ and $y_0$
in it, and so on. The relationship between the $n$th and $(n + 1)$th
approximations is of the form

$$P(x_n, y_n) + P'_x(x_n, y_n)(x_{n+1} - x_n) + P'_y(x_n, y_n)(y_{n+1} - y_n) = 0,$$
$$Q(x_n, y_n) + Q'_x(x_n, y_n)(x_{n+1} - x_n) + Q'_y(x_n, y_n)(y_{n+1} - y_n) = 0$$
If the process is convergent we pass to the limit as $n \to \infty$. In
the limit the last two summands in each of the equations vanish
and thus we see that the limiting values satisfy system (43). It can
be shown that if the initial approximation is chosen sufficiently
close to the sought-for solution and if the Jacobian (see Sec. IX.13)
$\dfrac{D(P, Q)}{D(x, y)}$ is unequal to zero, i.e. if

$$\frac{D(P, Q)}{D(x, y)} \neq 0$$

then the approximations are sure to converge. [How is the Jacobian
related to system (44)?]
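
The passage from one approximation to the next can be sketched as follows. The code is an illustration under assumptions of our own: the system $P = x^2 + y^2 - 4 = 0$, $Q = xy - 1 = 0$, the zeroth approximation $(2, 0.5)$, the tolerance and the function names do not come from the text. At each step the linear system of form (44) is solved for the corrections $h = x_{n+1} - x_n$ and $k = y_{n+1} - y_n$ by Cramer's rule, whose denominator is the Jacobian.

```python
import math

# Hypothetical system P(x, y) = 0, Q(x, y) = 0 chosen for the illustration.
def P(x, y):
    return x * x + y * y - 4.0

def Q(x, y):
    return x * y - 1.0

def partials(x, y):
    """Return P'_x, P'_y, Q'_x, Q'_y at the point (x, y)."""
    return 2.0 * x, 2.0 * y, y, x

def newton(x, y, tol=1e-12, max_steps=50):
    for _ in range(max_steps):
        px, py, qx, qy = partials(x, y)
        jac = px * qy - py * qx          # the Jacobian D(P, Q)/D(x, y)
        if jac == 0.0:
            raise ZeroDivisionError("the Jacobian vanishes")
        p, q = P(x, y), Q(x, y)
        # Solve px*h + py*k = -p, qx*h + qy*k = -q by Cramer's rule.
        h = (-p * qy + q * py) / jac
        k = (-q * px + p * qx) / jac
        x, y = x + h, y + k
        if abs(h) + abs(k) < tol:
            return x, y
    raise RuntimeError("the process did not converge")

x_root, y_root = newton(2.0, 0.5)
print(x_root, y_root, P(x_root, y_root), Q(x_root, y_root))
```

For this particular system the exact solution near the starting point is $x = (\sqrt{6} + \sqrt{2})/2$, $y = (\sqrt{6} - \sqrt{2})/2$, which gives a convenient check of the result.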

We can also take advantage of the fact that a solution of system
(43) simultaneously gives a minimum to the function
$V(x, y) = [P(x, y)]^2 + [Q(x, y)]^2$ (why is it so?). Instead of this function
we sometimes take a similar expression in which there are certain 
numerical positive coefficients in front of the squares of the func- 
tions which are introduced to balance the “significance” of both 
equations (43). Taking a function V of this type we then find its 
minimum with the help of one of the direct methods mentioned 
in Secs. 10 and 11. [Obviously, it is senseless to apply the necessary 
conditions for an extremum to the function V because this will 
lead us back to system (43). Let the reader verify this assertion.] 
If the minimal value thus found is equal to zero the point of the 
minimum yields a solution of system (43). 



CHAPTER XIII 


Indefinite Integral 


§ 1. Elementary Methods of Integration 

1. Basic Definitions. Let the function $f(x)$ be the derivative of
a function $F(x)$, i.e. $F'(x) = f(x)$. Then $F(x)$ is said to be an
antiderivative (or primitive) of $f(x)$. For instance, the function
$3x^2$ is the derivative of $x^3$, and $x^3$ is an antiderivative of $3x^2$.

Differential calculus deals with the basic problem of finding the 
derivative of a given function and with the problem of finding its 
differential which is directly related to the former problem. For 
functions of one independent variable, this problem was considered 
in Chapter IV. In particular, as it was shown in Sec. IV. 5, the deri- 
vative of any elementary function is an elementary function which 
is found by means of standard rules. 

The main problem of integral calculus is the reverse of that of
differential calculus. This is the problem of finding a function when the
derivative of the function is given, that is the problem of finding
antiderivatives of a given function. The significance of the problem
will be discussed in Chapter XIV. This problem is more complicated
than the problem of differentiation. (As a rule, “reverse” problems
are more complicated than the “direct” ones. For instance, the
problem of extracting a root is more complicated than the problem
of raising to a power.) In particular, we shall see that although the
antiderivative of any elementary function exists (and these are
the functions that we shall deal with in the present chapter) it
may not be an elementary function.

A given function possesses more than one antiderivative. For
example, we have not only $(x^3)' = 3x^2$ but $(x^3 + 5)' = 3x^2$ as well.
(It often turns out, in different examples, that solutions of reverse
problems are not unique.) In general, if a function $f(x)$ has
antiderivatives $F_1(x)$ and $F_2(x)$ then $F_1' = f$ and $F_2' = f$, i.e.
$F_1' - F_2' = 0$, $(F_1 - F_2)' = 0$, and thus we have $F_1 - F_2 = \text{const}$
(see Sec. IV.17) and $F_1 = F_2 + \text{const}$. Consequently, any two
antiderivatives of the same function differ by a constant summand. Hence,
in order to obtain all the antiderivatives of a given function it is
sufficient to take one of the antiderivatives and to add an arbitrary
constant to it. For instance, the family of all antiderivatives of the
function $3x^2$ is given by the formula $x^3 + C$ where $C$ is an arbitrary
constant. Making $C$ assume concrete numerical values we obtain
particular antiderivatives: $x^3$, $x^3 + 5$, $x^3 - \sqrt{2}$, etc.

The family of all antiderivatives of a function $f(x)$ is called the
indefinite integral of the function $f(x)$ and is denoted by the symbol
$\int f(x)\,dx$ whose meaning will be discussed at length in Sec. XIV.2.
Here $\int$ is the integral sign, $f(x)$ is the integrand and $f(x)\,dx$ is the
element of integration. Thus,

$$\text{if } F'(x) = f(x) \text{ then } \int f(x)\,dx = F(x) + C \text{ and vice versa} \tag{1}$$

For instance, $\int 3x^2\,dx = x^3 + C$. In other words, the indefinite
integral is the general expression of antiderivatives which involves
an arbitrary constant, and every concrete numerical value of the
constant yields a certain concrete antiderivative.

Formula (1) implies that

$$\left(\int f(x)\,dx\right)' = f(x), \qquad d\left(\int f(x)\,dx\right) = f(x)\,dx, \qquad \int dF(x) = F(x) + C \tag{2}$$

Therefore, the signs of integration and differentiation mutually
cancel out. The result of computing an indefinite integral can always
be verified by finding the derivative of the result. If the answer is
correct the differentiation must yield the integrand. To each
formula of differential calculus (see Secs. IV.4-5) there corresponds
a certain formula of integral calculus.
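
The verification by differentiation can also be carried out numerically. The sketch below is only an illustration, with names and tolerances chosen by us: the derivative of a proposed answer $F$ is approximated by a central difference and compared with the integrand $f$ at a few sample points. Such a test can expose a wrong answer, but agreement at sample points is of course not a proof.

```python
def derivative(F, x, h=1e-5):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

def is_antiderivative(F, f, points, tol=1e-6):
    """True if F'(x) agrees with f(x) to within tol at every sample point."""
    return all(abs(derivative(F, x) - f(x)) < tol for x in points)

samples = [-2.0, 0.5, 1.5]

# x^3 + 5 is an antiderivative of 3x^2 ...
good = is_antiderivative(lambda x: x ** 3 + 5.0, lambda x: 3.0 * x * x, samples)
# ... but x^3 is not an antiderivative of 3x.
bad = is_antiderivative(lambda x: x ** 3, lambda x: 3.0 * x, samples)
print(good, bad)
```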

2. The Simplest Integrals. These integrals are obtained by reversing
the formulas for differentiation of basic elementary functions (see
Sec. IV.5). For instance, the formula $(\sin x)' = \cos x$ implies

$$\int \cos x\,dx = \sin x + C \tag{3}$$

[see formula (1)]. The formula $(\cos x)' = -\sin x$, or $(-\cos x)' = \sin x$, implies

$$\int \sin x\,dx = -\cos x + C$$

Similarly,

$$\int \frac{1}{\cos^2 x}\,dx = \tan x + C \quad \left(\text{this can be written as } \int \frac{dx}{\cos^2 x} = \tan x + C\right),$$

$$\int \frac{dx}{\sin^2 x} = -\cot x + C \quad \text{and} \quad \int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x + C \tag{4}$$

From the formula $(\arccos x)' = -\dfrac{1}{\sqrt{1 - x^2}}$ we deduce

$$\int \frac{dx}{\sqrt{1 - x^2}} = -\arccos x + C \tag{5}$$


At first glance one can think that the latter formula contradicts the
former. But this is not so because the formula $\arcsin x + \arccos x = \frac{\pi}{2}$
(see Sec. IV.18) and formulas (4) imply that

$$\int \frac{dx}{\sqrt{1 - x^2}} = \arcsin x + C = \frac{\pi}{2} - \arccos x + C = -\arccos x + C_1$$

where $C_1 = \frac{\pi}{2} + C$. Thus, the matter is that the right-hand sides
of formulas (4) and (5) contain different arbitrary constants. Such
a discrepancy in different forms of an answer can appear in many
other examples of indefinite integrals. Naturally, in concrete
computations one must choose either formula (4) or formula (5); for
instance, we can choose formula (4).

Other formulas of differentiation imply

$$\int \frac{dx}{1 + x^2} = \arctan x + C \quad \text{and} \quad \int \frac{dx}{x} = \ln x + C$$

A disadvantage of the last formula is that the function $\frac{1}{x}$ whose
antiderivative is being computed exists both for $x > 0$ and for
$x < 0$ whereas the right-hand side is defined only for $x > 0$. But
we can easily verify that there is a more general differentiation
formula of the form $(\ln |x|)' = \frac{1}{x}$. Indeed, we have $|x| = x$
for $x > 0$ and therefore our formula yields the ordinary derivative
of the logarithmic function in this case, and we have $|x| = -x$
for $x < 0$ and hence, for this case, we have
$(\ln |x|)' = (\ln (-x))' = \frac{1}{-x}\,(-1) = \frac{1}{x}$. Therefore, the formula

$$\int \frac{dx}{x} = \ln |x| + C \tag{6}$$

is valid both for $x > 0$ and for $x < 0$.

Further, we deduce

$$\int a^x\,dx = \frac{a^x}{\ln a} + C \quad \left(\text{in particular, } \int e^x\,dx = e^x + C\right)$$

and

$$\int x^m\,dx = \frac{x^{m+1}}{m + 1} + C$$

Of course, the last formula does not hold for $m = -1$ because the
denominator vanishes in this case. But for $m = -1$ the integral
turns into $\int \frac{dx}{x}$ which is computed by formula (6).

Further, we have

$$\int \cosh x\,dx = \sinh x + C, \qquad \int \sinh x\,dx = \cosh x + C,$$

$$\int \frac{dx}{\cosh^2 x} = \tanh x + C \quad \text{and} \quad \int \frac{dx}{\sqrt{x^2 + 1}} = \sinh^{-1} x + C = \ln \left(x + \sqrt{x^2 + 1}\right) + C$$

(the formula for $\sinh^{-1} x$ was deduced in Sec. I.28).

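The formula deduced above can be obtained without hyperbolic
functions if we directly put it down in the form

$$\int \frac{dx}{\sqrt{x^2 + 1}} = \ln \left(x + \sqrt{x^2 + 1}\right) + C$$

and then differentiate the answer. Moreover, we have $(\ln |u|)' = \frac{1}{u}\,u'_x$
and thus

$$\int \frac{dx}{\sqrt{x^2 + a}} = \ln \left|x + \sqrt{x^2 + a}\right| + C \quad (a = \text{const})$$

because

$$\left(\ln \left|x + \sqrt{x^2 + a}\right|\right)' = \frac{1}{x + \sqrt{x^2 + a}} \left(1 + \frac{2x}{2 \sqrt{x^2 + a}}\right) = \frac{1}{x + \sqrt{x^2 + a}} \cdot \frac{\sqrt{x^2 + a} + x}{\sqrt{x^2 + a}} = \frac{1}{\sqrt{x^2 + a}}$$

where $a$ is a constant of an arbitrary sign.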

The above formulas form the table of integrals of the simplest 
functions (tabular integrals). The reader should put down the table 
and learn it by heart because it is widely used for computing inte- 
grals. In particular, the formulas enable us to find the integrals of 
certain functions which can be readily reduced to tabular integrals. 
This is the so-called direct integration which is based on these for- 
mulas. To find an integral of this type we take a suitable tabular 
integral and try to change the answer in such a manner that its 
derivative should be equal to the integrand. Essentially, this method 
reduces to applying formula (1). For example, in order to find the
integral

$$\int \cos 3x\,dx \tag{7}$$

it is natural to take formula (3). But the answer $\sin 3x + C$ does
not apply since the derivative of $\sin 3x + C$ is equal to $3 \cos 3x$
and not to $\cos 3x$ as required. But if we divide $\sin 3x$ by 3 the
derivative will be multiplied by $\frac{1}{3}$. Hence, $\left(\frac{1}{3} \sin 3x + C\right)' = \cos 3x$,
i.e. $\int \cos 3x\,dx = \frac{1}{3} \sin 3x + C$.

In like manner we find

$$\int \frac{dx}{\sqrt{1 - (2x + 5)^2}} = \frac{1}{2} \arcsin (2x + 5) + C, \qquad \int \frac{dx}{x - 3} = \ln |x - 3| + C,$$

$$\int \frac{dx}{1 + \frac{x^2}{2}} = \sqrt{2} \arctan \frac{x}{\sqrt{2}} + C \tag{8}$$

(check the answers by means of differentiation!).

Generally, if an integral $\int f(x)\,dx = F(x) + C$ has been found
then

$$\int f(ax + b)\,dx = \frac{1}{a} F(ax + b) + C$$

where $a$ and $b$ are arbitrary constant numbers ($a \neq 0$).
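
The general rule can be checked in one line by differentiating the right-hand side as a composite function. In the sketch below we use only $F'(u) = f(u)$ and the assumption $a \neq 0$, which is needed for the division by $a$ to make sense; the second line applies the rule to integral (7).

```latex
\left[\frac{1}{a}\,F(ax + b)\right]'
  = \frac{1}{a}\,F'(ax + b)\cdot (ax + b)'
  = \frac{1}{a}\,f(ax + b)\cdot a
  = f(ax + b),
\qquad
\int \cos 3x\,dx = \frac{1}{3}\sin 3x + C
\quad (f = \cos,\ F = \sin,\ a = 3,\ b = 0).
```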

3. The Simplest Properties of an Indefinite Integral. These
properties are implied by the analogous properties of a derivative (see
Sec. IV.4). For instance,

$$\int [f(x) \pm \varphi(x)]\,dx = \int f(x)\,dx \pm \int \varphi(x)\,dx \tag{9}$$

that is the indefinite integral of an algebraic sum is equal to the 
sum of the integrals of the summands. To prove the property we 
take the derivatives of the left-hand side and right-hand side and 
then verify that the results will be equal on the basis of the first 
formula (2) and on the basis of the well-known property of deriva- 
tives which asserts that the derivative of a sum is equal to the sum 
of the derivatives of the summands. The derivatives being equal, 
the corresponding functions can differ only in a constant summand. 
Thus the proof is completed because the constant must not be put 
down in formula (9) since the integral signs include arbitrary con- 
stant summands. 

We similarly verify that

$$\int A f(x)\,dx = A \int f(x)\,dx \quad (A = \text{const}) \tag{10}$$

Thus, a constant factor can be taken outside the integral sign.

A given integral can often be represented in the form of a sum
of tabular integrals with the help of formulas (9) and (10). Then we
perform termwise integration and thus obtain the answer (this is
the decomposition method). Let us consider several examples; we have

$$\int (3x^3 - 2x + 5)\,dx = \int 3x^3\,dx - \int 2x\,dx + \int 5\,dx = 3 \int x^3\,dx - 2 \int x\,dx + 5 \int dx$$
$$= 3\,\frac{x^4}{4} - 2\,\frac{x^2}{2} + 5x + C = \frac{3}{4}\,x^4 - x^2 + 5x + C \tag{11}$$

(of course, we have put down only one arbitrary constant here
because a sum of arbitrary constants is an arbitrary constant),

$$\int \frac{dx}{a^2 + x^2} = \frac{1}{a^2} \int \frac{dx}{1 + \frac{x^2}{a^2}} = \frac{1}{a^2}\,a \arctan \frac{x}{a} + C = \frac{1}{a} \arctan \frac{x}{a} + C$$

[compare with example (8)] and, similarly,

$$\int \frac{dx}{\sqrt{a^2 - x^2}} = \arcsin \frac{x}{a} + C \quad (a > 0) \tag{12}$$

(verify the answer!).
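Other examples are

$$\int \tan^2 x\,dx = \int \frac{\sin^2 x}{\cos^2 x}\,dx = \int \frac{1 - \cos^2 x}{\cos^2 x}\,dx = \int \left(\frac{1}{\cos^2 x} - 1\right) dx = \tan x - x + C,$$

$$\int \frac{dx}{x (x - 1)} = \int \frac{x - (x - 1)}{x (x - 1)}\,dx = \int \left(\frac{1}{x - 1} - \frac{1}{x}\right) dx = \ln |x - 1| - \ln |x| + C = \ln \left|\frac{x - 1}{x}\right| + C,$$

$$\int \frac{dx}{x^2 - a^2} = \int \frac{(x + a) - (x - a)}{2a (x - a)(x + a)}\,dx = \frac{1}{2a} \int \left(\frac{1}{x - a} - \frac{1}{x + a}\right) dx = \frac{1}{2a} \ln \left|\frac{x - a}{x + a}\right| + C$$
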
In computing the last two integrals we have applied a ge- 
neral method of representing a given fraction in the form of a sum 
of simpler fractions. When using the method we factor the denomi- 
nator and then try to represent the numerator as a combination 
of factors entering into the denominator. If this is possible we get 
a decomposition of the fraction into a sum of several fractions and
perform cancellation in each of the summands.

Let us take one more useful example. Let it be necessary to find
the integral $\int \sin 5x \cos 3x\,dx$. From trigonometry we know the
formula

$$\sin \alpha \cos \beta = \frac{1}{2} [\sin (\alpha + \beta) + \sin (\alpha - \beta)]$$

Therefore,

$$\int \sin 5x \cos 3x\,dx = \int \frac{\sin 8x + \sin 2x}{2}\,dx = -\frac{1}{16} \cos 8x - \frac{1}{4} \cos 2x + C$$

In similar circumstances we also utilize the formulas

$$\sin \alpha \sin \beta = \frac{1}{2} [\cos (\alpha - \beta) - \cos (\alpha + \beta)], \qquad \cos \alpha \cos \beta = \frac{1}{2} [\cos (\alpha + \beta) + \cos (\alpha - \beta)],$$

$$\sin^2 \alpha = \frac{1 - \cos 2\alpha}{2} \quad \text{and} \quad \cos^2 \alpha = \frac{1 + \cos 2\alpha}{2}$$

For instance,

$$\int \sin^2 3x\,dx = \int \frac{1 - \cos 6x}{2}\,dx = \frac{1}{2}\,x - \frac{1}{12} \sin 6x + C$$

We now mention one more interesting technique based on applying
complex functions of a real argument (see Sec. VIII.6) for which
all the integration formulas remain valid. Obviously, if we integrate
such a function its real part and imaginary part will also be
integrated, that is if $f(x) = u(x) + iv(x)$ then

$$\int f(x)\,dx = \int (u(x) + iv(x))\,dx = \int u(x)\,dx + i \int v(x)\,dx = \int (\operatorname{Re} f(x))\,dx + i \int (\operatorname{Im} f(x))\,dx$$

Therefore $\operatorname{Re} \left(\int f(x)\,dx\right) = \int (\operatorname{Re} f(x))\,dx$ and
$\operatorname{Im} \left(\int f(x)\,dx\right) = \int (\operatorname{Im} f(x))\,dx$. For instance, this enables us to find the real
integral

$$\int e^{ax} \cos bx\,dx = \int \operatorname{Re} \left[e^{ax} e^{ibx}\right] dx = \operatorname{Re} \int e^{(a + ib)x}\,dx = \operatorname{Re} \frac{e^{(a + ib)x}}{a + ib} + C$$
$$= \operatorname{Re} \frac{e^{ax} (\cos bx + i \sin bx)(a - ib)}{a^2 + b^2} + C = e^{ax}\,\frac{a \cos bx + b \sin bx}{a^2 + b^2} + C$$

by means of Euler’s formula (see Sec. VIII.4).
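
The result can be checked numerically with complex arithmetic. The sketch below is an illustration with values and names of our own choosing ($a = 0.5$, $b = 2$ and the sample points are assumptions): it compares the real part of $e^{(a+ib)x}/(a+ib)$ with the closed formula obtained above, and then verifies by a central difference that the derivative of the answer reproduces the integrand $e^{ax} \cos bx$.

```python
import cmath
import math

def F_complex(x, a, b):
    """Real part of e^{(a+ib)x} / (a+ib), computed with complex numbers."""
    return (cmath.exp(complex(a, b) * x) / complex(a, b)).real

def F_closed(x, a, b):
    """The closed form e^{ax} (a cos bx + b sin bx) / (a^2 + b^2)."""
    return math.exp(a * x) * (a * math.cos(b * x) + b * math.sin(b * x)) / (a * a + b * b)

def integrand(x, a, b):
    return math.exp(a * x) * math.cos(b * x)

def derivative(F, x, h=1e-5):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

a, b = 0.5, 2.0                      # illustrative values
for x in (-1.0, 0.3, 1.2):
    gap = abs(F_complex(x, a, b) - F_closed(x, a, b))
    err = abs(derivative(lambda t: F_closed(t, a, b), x) - integrand(x, a, b))
    print(x, gap, err)
```

Both printed discrepancies should be at the level of rounding and discretization errors.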

4. Integration by Parts. Unfortunately, there is no formula expressing
the integral of a product of functions in terms of the integrals
of the factors. As we know, the derivative of an elementary function
is always an elementary function. But the integral of an elementary
function may not be an elementary function, and this fact is
connected with the above property. (The notion of an elementary
function was introduced in Sec. I.18.) For instance, we have the tabular
integrals $\int \sin x\,dx$ and $\int \frac{1}{x}\,dx$ but the integral $\int \frac{\sin x}{x}\,dx$ is not
expressible in terms of elementary functions. Integrals of this type
will be discussed in Sec. 11.

But if we integrate both sides of the formula $(uv)' = u'v + uv'$ (see
Sec. IV.4) we obtain

$$uv = \int u'v\,dx + \int uv'\,dx$$

that is

$$\int uv'\,dx = uv - \int u'v\,dx \tag{13}$$

or, which is the same,

$$\int u\,dv = uv - \int v\,du \tag{14}$$


Formula (13) or the equivalent formula (14) is called the formula
of integration by parts. When applying formula (13) we factor
the integrand into two factors, i.e. $u$ and $v'$, and then differentiate
the first factor and integrate the second. Hence, we pass to an
integral in which $u'$ substitutes for $u$ and $v$ for $v'$. After such a
transformation we may arrive at a tabular integral or at an integral which
is simpler than the original one.

Take some examples. In calculating the integral $\int x^2 \ln x\,dx$
we see that it is advisable to differentiate $\ln x$ because this yields
a power function which is simpler than the logarithmic one. Of
course, we must simultaneously integrate the other factor ($x^2$) but
this yields a power function again. Hence, putting $u = \ln x$ and
$dv = x^2\,dx$ we find $u' = \frac{1}{x}$, $v = \frac{x^3}{3}$, and thus we have

$$\int x^2 \ln x\,dx = \ln x \cdot \frac{x^3}{3} - \int \frac{x^3}{3}\,d (\ln x) = \frac{x^3}{3} \ln x - \int \frac{x^2}{3}\,dx = \frac{x^3}{3} \ln x - \frac{x^3}{9} + C$$

It should be noted that while computing $v$ we did not put down the
corresponding arbitrary constant, that is we did not write
$v = \frac{x^3}{3} + C$, because for our aims it was sufficient to obtain a single
function $v$.

Similarly, we often try to differentiate $\arctan x$ and $\arcsin x$
because this yields simpler functions.

When computing the integral $\int x^2 \sin 3x\,dx$ we should
differentiate the power function since this reduces the exponent by
unity. Therefore, integrating by parts twice in this manner we arrive
at a tabular integral. At the same time, the differentiation or
integration of the sine yields trigonometric functions and thus this
factor is neither simplified nor complicated. Hence, we have

$$\int x^2 \sin 3x\,dx = -\frac{x^2}{3} \cos 3x + \int \frac{1}{3} \cos 3x \cdot 2x\,dx$$

(we have used the expressions $u = x^2$ and $dv = \sin 3x\,dx$ here,
i.e. $du = 2x\,dx$ and $v = -\frac{1}{3} \cos 3x$). Further, denoting $u = x$
and $dv = \cos 3x\,dx$, i.e. $du = dx$ and $v = \frac{1}{3} \sin 3x$, we finally
obtain

$$\int x^2 \sin 3x\,dx = -\frac{x^2}{3} \cos 3x + \frac{2}{3} \left(\frac{x}{3} \sin 3x - \frac{1}{3} \int \sin 3x\,dx\right)$$
$$= -\frac{1}{3}\,x^2 \cos 3x + \frac{2}{9}\,x \sin 3x + \frac{2}{27} \cos 3x + C$$


It can happen that after integration by parts we obtain the ori- 
ginal integral on the right-hand side but with another coefficient. 
Then, combining similar terms we can compute the integral. For
instance, we compute the integral $\int \sqrt{1 - x^2}\,dx$ by introducing
$u = \sqrt{1 - x^2}$ and $dv = dx$ (i.e. $du = -\frac{x\,dx}{\sqrt{1 - x^2}}$ and $v = x$):

$$\int \sqrt{1 - x^2}\,dx = x \sqrt{1 - x^2} + \int \frac{x^2}{\sqrt{1 - x^2}}\,dx = x \sqrt{1 - x^2} + \int \frac{-(1 - x^2) + 1}{\sqrt{1 - x^2}}\,dx$$
$$= x \sqrt{1 - x^2} - \int \sqrt{1 - x^2}\,dx + \int \frac{dx}{\sqrt{1 - x^2}} = x \sqrt{1 - x^2} - \int \sqrt{1 - x^2}\,dx + \arcsin x$$

Now, transposing the integral thus obtained to the left-hand
side we obtain

$$2 \int \sqrt{1 - x^2}\,dx = x \sqrt{1 - x^2} + \arcsin x + C$$

where $C$ is a constant (because the expression $\int \sqrt{1 - x^2}\,dx$, an
indefinite integral, which we have transposed is defined to within
a constant addend). Consequently,

$$\int \sqrt{1 - x^2}\,dx = \frac{1}{2}\,x \sqrt{1 - x^2} + \frac{1}{2} \arcsin x + C_1$$

where $C_1 = \frac{C}{2}$ is an arbitrary constant.


5. Integration by Change of Variable (by Substitution). Here we are
going to describe one of the most widely used methods of integral
calculus, based on the formula for differentiation of a composite
function (Sec. IV.4). Suppose a function $F(x)$ is an antiderivative
of $f(x)$, and let $x$ depend in a certain manner on $t$, i.e. $x = \varphi(t)$.
Compute the derivative of $F(x)$ with respect to $t$:

$$[F(x)]'_t = [F(x)]'_x\,x'_t = f(x)\,\varphi'(t) = f[\varphi(t)]\,\varphi'(t)$$

If we integrate both sides with respect to $t$ we get

$$F(x) + C = \int f[\varphi(t)]\,\varphi'(t)\,dt$$

Therefore, by formula (1), we have

$$\left[\int f(x)\,dx\right]_{x = \varphi(t)} = \int f[\varphi(t)]\,\varphi'(t)\,dt \tag{15}$$

It is this formula that is the basic formula of integration by
substitution (by change of variable).

We have $\varphi'(t)\,dt = dx$, and the right-hand side of formula (15)
can therefore be rewritten as $\int f(x)\,dx$. But in the process of
integration we do not regard $x$ as an independent variable but
consider it to be dependent on $t$.

Consequently, formula (15) can be interpreted as follows: any
integration formula of the form

$$\int f(x)\,dx = F(x) + C \tag{16}$$

remains valid if we make an arbitrary substitution $x = \varphi(t)$ both
in the right-hand side and in the element of integration. Any
formula of the form (16) is invariant in this sense.

For instance, substituting $x = u^3$ into formula (3) we obtain

$$\int \cos u^3\,d (u^3) = \sin u^3 + C, \quad \text{that is} \quad \int u^2 \cos u^3\,du = \frac{1}{3} \sin u^3 + C$$

and the like. Of course, when we apply formula (15) to a practical
problem we do not start from a tabular formula; on the contrary,
we try to find a change of variable which reduces a given integral
to a certain tabular integral.

Let us consider some examples. To get rid of the radical in the
integral $\int \frac{\sqrt{x}}{1 + x}\,dx$ we perform the change of variable $x = t^2$ and
$dx = 2t\,dt$:

$$\int \frac{\sqrt{x}}{1 + x}\,dx = \int \frac{t}{1 + t^2}\,2t\,dt = 2 \int \frac{t^2}{1 + t^2}\,dt = 2 \left(\int dt - \int \frac{dt}{1 + t^2}\right)$$
$$= 2 (t - \arctan t) + C = 2 \left(\sqrt{x} - \arctan \sqrt{x}\right) + C$$

Hence, after performing the substitution and integration we must
make the reverse substitution, that is we must pass from $t$ to $x$.

We sometimes regard the right-hand side of formula (15) as given
and apply the formula for computing the left-hand side, that is
we make the substitution $\psi(x) = u$ instead of $x = \varphi(t)$. For
example, to find the integral $\int x e^{x^2}\,dx$ we take advantage of the
fact that the element of integration can be simply expressed in
terms of $x^2$ because $x\,dx = \frac{1}{2}\,d (x^2)$. Therefore, putting $x^2 = u$,
$2x\,dx = du$ we obtain

$$\int x e^{x^2}\,dx = \int e^u\,\frac{1}{2}\,du = \frac{1}{2}\,e^u + C = \frac{1}{2}\,e^{x^2} + C$$

Substitutions of the form $\psi(x) = \varphi(t)$ are also applied in some
problems.

We could compute integral (7) by means of the substitution $3x = t$,
$3\,dx = dt$:

$$\int \cos 3x\,dx = \int \cos t\,\frac{dt}{3} = \frac{1}{3} \int \cos t\,dt = \frac{1}{3} \sin t + C = \frac{1}{3} \sin 3x + C$$

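We can sometimes perform calculations of this type without putting
down the change of variable explicitly; for instance,

$$\int \cos 3x\,dx = \int \cos 3x\,\frac{1}{3}\,d (3x) = \frac{1}{3} \int \cos 3x\,d (3x) = \frac{1}{3} \sin 3x + C$$

Here we have used the invariance of formula (3).

Similarly,

$$\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = -\int \frac{d (\cos x)}{\cos x} = -\ln |\cos x| + C$$

In general, we have

$$\int \frac{f'(x)}{f(x)}\,dx = \int \frac{d f(x)}{f(x)} = \ln |f(x)| + C \tag{17}$$

Further, we have

$$\int \frac{x}{\sqrt{x^2 + 1}}\,dx = \int (x^2 + 1)^{-\frac{1}{2}}\,\frac{1}{2}\,d (x^2 + 1) = \frac{1}{2} \cdot \frac{(x^2 + 1)^{\frac{1}{2}}}{\frac{1}{2}} + C = \sqrt{x^2 + 1} + C$$

This integration formula can be put down in the general form

$$\int \frac{f'(x)}{\sqrt{f(x)}}\,dx = \int [f(x)]^{-\frac{1}{2}}\,d f(x) = \frac{[f(x)]^{\frac{1}{2}}}{\frac{1}{2}} + C = 2 \sqrt{f(x)} + C \tag{18}$$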


In particular, by means of formulas (17) and (18) and by the
method of completing a square, we can compute the integrals of
the form

$$\int \frac{ax + b}{\sqrt{px^2 + qx + r}}\,dx \quad \text{and} \quad \int \frac{ax + b}{px^2 + qx + r}\,dx$$

which are widely encountered.

For example, let us illustrate the computation of the integral
$\int \frac{2x - 3}{\sqrt{-3x^2 + 2x + 1}}\,dx$. To do this we take into account that the
derivative of the radicand is equal to $-6x + 2 = -6 \left(x - \frac{1}{3}\right)$:

$$\int \frac{2x - 3}{\sqrt{-3x^2 + 2x + 1}}\,dx = \int \frac{2 \left(x - \frac{1}{3}\right) - \frac{7}{3}}{\sqrt{-3x^2 + 2x + 1}}\,dx$$
$$= -\frac{1}{3} \int \frac{-6x + 2}{\sqrt{-3x^2 + 2x + 1}}\,dx - \frac{7}{3} \int \frac{dx}{\sqrt{-3x^2 + 2x + 1}}$$
$$= -\frac{1}{3} \int \frac{d (-3x^2 + 2x + 1)}{\sqrt{-3x^2 + 2x + 1}} - \frac{7}{3 \sqrt{3}} \int \frac{d \left(x - \frac{1}{3}\right)}{\sqrt{\left(\frac{2}{3}\right)^2 - \left(x - \frac{1}{3}\right)^2}}$$
$$= -\frac{2}{3} \sqrt{-3x^2 + 2x + 1} - \frac{7}{3 \sqrt{3}} \arcsin \frac{x - \frac{1}{3}}{\frac{2}{3}} + C$$
$$= -\frac{2}{3} \sqrt{-3x^2 + 2x + 1} - \frac{7}{3 \sqrt{3}} \arcsin \frac{3x - 1}{2} + C \tag{19}$$

[see formula (12)].

The problem of integration is much more complicated than the
problem of differentiation. The reader needs a good deal of practice
in order to master the elementary methods of integration.


§ 2. Standard Methods of Integration 

Here we shall present some classes of functions which can be
integrated by means of certain standard methods. It should be
noted that in some cases these standard methods may not be the
simplest. It is often advisable to perform certain preliminary
transformations or to apply directly the methods of § 1 in order to simplify
calculations. But the reader will be able to find the simplest
technique leading to the desired result only after the necessary experience
in computing integrals has been acquired.

6. Integration of Rational Functions. Rational functions are
integrated on the basis of the results of Sec. VIII.10. As was shown,
any rational function (rational fraction) can be represented in the
form of a sum of an entire rational function (a polynomial), in case
the fraction in question is improper, and partial rational fractions.

A polynomial can be integrated termwise by means of the simplest
methods [for instance, see example (11)]. Partial fractions
of the form $\frac{A}{(x - a)^\alpha}$ can also be integrated quite easily.
For instance, if it is necessary to integrate function (VIII.38)
then, by (VIII.39) and (VIII.42), we obtain

$$\int \frac{x^3 - 2x + 3}{x (x - 1)(x + 2)^2}\,dx = \int \left[-\frac{3}{4x} + \frac{2}{9 (x - 1)} - \frac{1}{6 (x + 2)^2} + \frac{55}{36 (x + 2)}\right] dx$$
$$= -\frac{3}{4} \ln |x| + \frac{2}{9} \ln |x - 1| + \frac{1}{6 (x + 2)} + \frac{55}{36} \ln |x + 2| + \text{const}$$

Hence, now we must consider partial fractions of the form 

Mx + N _ 4 < 0) ( 20 ) 

(x*+px+q) P “ ' ' 

We begin the integration with a simplification of the numerator.
Namely, taking into account that $(x^2 + px + q)' = 2x + p = 2\left(x + \frac{p}{2}\right)$
we replace x in the numerator by $\left(x + \frac{p}{2}\right) - \frac{p}{2}$ and then combine
similar terms without removing the parentheses. After that we
break up the integral into two integrals [as in calculating
expression (19)]. The first integral is of the form

$$\int \frac{\left(x + \frac{p}{2}\right) dx}{(x^2+px+q)^{\beta}} = \frac{1}{2}\int \frac{d(x^2+px+q)}{(x^2+px+q)^{\beta}}$$

and we therefore find it immediately. The second integral is of the
form $\int \frac{dx}{(x^2+px+q)^{\beta}}$. To compute it we complete the square
in the denominator which results in $x^2 + px + q = (x + a)^2 + b$
where a and b are constants. Now, if we put x + a = y we arrive
at the integral

$$I_{\beta} = \int \frac{dy}{(y^2+b)^{\beta}} \tag{21}$$
which can be easily found in the case β = 1 (how can we do it?).
To find integral (21) for β = 2, 3, ... we shall deduce a recurrence
formula which will enable us to pass from $I_{\beta}$ to the simpler integral
$I_{\beta-1}$ etc. The formula is obtained by means of integration by parts.
We have

$$I_{\beta} = \frac{1}{b}\int \frac{(b + y^2) - y^2}{(y^2+b)^{\beta}}\,dy = \frac{1}{b} I_{\beta-1} - \frac{1}{b}\int \frac{y^2\,dy}{(y^2+b)^{\beta}}$$

Here we put u = y, $dv = \frac{y\,dy}{(y^2+b)^{\beta}}$, i.e. du = dy,

$$v = \int \frac{y\,dy}{(y^2+b)^{\beta}} = \frac{1}{2}\int \frac{d(y^2+b)}{(y^2+b)^{\beta}} = \frac{-1}{2(\beta-1)}\cdot\frac{1}{(y^2+b)^{\beta-1}}$$

Hence,

$$I_{\beta} = \frac{1}{b} I_{\beta-1} + \frac{y}{2b(\beta-1)(y^2+b)^{\beta-1}} - \frac{1}{b}\int \frac{1}{2(\beta-1)}\cdot\frac{dy}{(y^2+b)^{\beta-1}} = \frac{y}{2b(\beta-1)(y^2+b)^{\beta-1}} + \frac{2\beta-3}{2b(\beta-1)} I_{\beta-1} \tag{22}$$

(let the reader verify all the calculations!).

As we have already mentioned, formulas of this type are called
recurrence formulas. Such formulas express an unknown quantity
dependent on a number (this is the quantity $I_{\beta}$ with the number β
in our case) in terms of similar quantities with lower numbers (this
is the quantity $I_{\beta-1}$ with the number β − 1 in our case). These
formulas may not yield the solution immediately but they enable
us to obtain the solution after several successive reductions of the
number. Thus, formula (22) expresses $I_{\beta}$ in terms of $I_{\beta-1}$. If we
repeatedly apply the formula to $I_{\beta-1}$, that is if we substitute β − 1
for β into formula (22), we obtain the expression of $I_{\beta-1}$ in terms of
$I_{\beta-2}$ etc. Finally, we arrive at the integral $I_1$ which is immediately
found, as has already been indicated.
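
As a rough numerical illustration of how recurrence formula (22) is
applied, the following sketch (written in Python, which is our choice
and not part of the original course) evaluates $I_{\beta}$ by descending
to $I_1$ and compares the result with a direct numerical quadrature.
The names I_beta and simpson and the sample values b = 2, β = 3 are
assumptions made only for this illustration; the base case uses the
arc tangent and therefore presupposes b > 0.

```python
import math

def I_beta(y, b, beta):
    """Antiderivative of 1/(y^2 + b)^beta for b > 0, built with recurrence (22)."""
    if beta == 1:
        # Base case: the integral of dy/(y^2 + b) is arctan(y/sqrt(b))/sqrt(b).
        return math.atan(y / math.sqrt(b)) / math.sqrt(b)
    # First term of (22): y / (2b(beta-1)(y^2+b)^(beta-1)).
    boundary = y / (2 * b * (beta - 1) * (y * y + b) ** (beta - 1))
    # Second term of (22): (2 beta - 3)/(2b(beta-1)) times I_(beta-1).
    factor = (2 * beta - 3) / (2 * b * (beta - 1))
    return boundary + factor * I_beta(y, b, beta - 1)

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

b, beta = 2.0, 3
closed_form = I_beta(1.5, b, beta) - I_beta(0.0, b, beta)
numeric = simpson(lambda y: 1.0 / (y * y + b) ** beta, 0.0, 1.5)
print(closed_form, numeric)
```

The two printed numbers should agree to many decimal places, which
serves as a check of the signs and coefficients in formula (22).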

It is worth noting that in the above calculations we did not use 
the fact that the trinomial in the denominator of expression (20) 
has imaginary roots. The procedure can therefore be applied to 
integrating a fraction of form (20) when the denominator has real 
roots without decomposing the fraction into two summands of the
form $\frac{A}{(x-a)^{\alpha}}$.

Integral (21) for b > 0 can also be computed with the help of the
substitution $y = \sqrt{b}\,\tan t$. This leads to an integral of a power
of cos t which will be discussed later.
Hence, the integral of a rational fraction is always expressible
in terms of elementary functions, and this can be achieved by means 
of the above standard methods. The elementary functions in terms 
of which an integral of this type is expressed are rational functions, 
the logarithmic function and the arc tangent. The most difficult 
thing in the integration is the factorization of the denominator in 
accordance with formula (VIII. 29). 

Methods of computing many integrals of other types which we 
are going to study here are essentially based on the transition from 
a given integral to an integral of a rational function by means of 
suitable substitutions. This is the so-called rationalization of the 
integral which reduces the computation to the above standard 
methods. 

7. Integration of Irrational Functions Involving Linear and
Linear-Fractional Expressions. First we take an integral of the form

$$\int R\left(x, \sqrt[n]{ax+b}\right) dx \qquad (n = 2, 3, \ldots) \tag{23}$$

where a and b are constants and R (x, y) is a rational function of
its two arguments x and y (see Sec. I.17). The integrand is an
irrational function here because it contains the radical. To rationalize
the integral let us use the substitution

$$ax + b = t^n, \qquad a\,dx = n t^{n-1}\,dt$$

which yields

$$\int R\left(x, \sqrt[n]{ax+b}\right) dx = \int R\left(\frac{t^n - b}{a},\, t\right) \frac{n}{a}\, t^{n-1}\,dt$$

The integrand in the last integral is a rational function (why?).
Similarly, an integral of the form

$$\int R\left(x, \sqrt[n]{ax+b}, \sqrt[m]{ax+b}, \ldots\right) dx \qquad (n, m = 2, 3, 4, \ldots) \tag{24}$$

where R (x, y, z, ...) is a rational function of its arguments
x, y, z, ... goes into an integral of a rational function after the
substitution $ax + b = t^p$ with p suitably chosen (how must we
choose p in the general case?).

For example, the substitution $2x + 3 = t^6$, $2\,dx = 6t^5\,dt$ yields

$$\int \frac{dx}{\sqrt{2x+3} - 2\sqrt[3]{2x+3}} = \int \frac{3t^5\,dt}{t^3 - 2t^2} = 3\int \frac{t^3}{t-2}\,dt$$

Performing the division of $t^3$ by $t - 2$ we find

$$\frac{t^3}{t-2} = t^2 + 2t + 4 + \frac{8}{t-2}$$

and hence we finally obtain

$$\int \frac{dx}{\sqrt{2x+3} - 2\sqrt[3]{2x+3}} = 3\int\left(t^2 + 2t + 4 + \frac{8}{t-2}\right)dt =$$
$$= t^3 + 3t^2 + 12t + 24\ln|t-2| + C = \sqrt{2x+3} + 3\sqrt[3]{2x+3} + 12\sqrt[6]{2x+3} + 24\ln\left|\sqrt[6]{2x+3} - 2\right| + C$$
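
The result of this example can be checked without any symbolic work:
the derivative of the antiderivative must reproduce the integrand.
The sketch below (in Python; the function names and the sample points
are our own assumptions) compares the integrand with a central
difference quotient of the antiderivative at a few points where
2x + 3 > 0 and the denominator does not vanish.

```python
import math

def integrand(x):
    """1 / (sqrt(2x+3) - 2 * cbrt(2x+3)) for 2x + 3 > 0."""
    r = 2 * x + 3
    return 1.0 / (math.sqrt(r) - 2 * r ** (1 / 3))

def antiderivative(x):
    """The final expression of the example written through t = (2x+3)^(1/6)."""
    t = (2 * x + 3) ** (1 / 6)
    return t**3 + 3 * t**2 + 12 * t + 24 * math.log(abs(t - 2))

def central_difference(f, x, h=1e-5):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (0.0, 1.0, 5.0, 10.0):
    print(x, integrand(x), central_difference(antiderivative, x))
```

The two columns printed for each x should coincide up to the error
of the difference quotient.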

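The rationalization of an integral of the form

$$\int R\left(x, \sqrt[n]{\frac{ax+b}{cx+d}}\right) dx \qquad (n = 2, 3, \ldots) \tag{25}$$

where R (x, y) is a rational function is carried out by means of the
substitution

$$\frac{ax+b}{cx+d} = t^n, \qquad ax + b = cx\,t^n + d\,t^n, \qquad x = \frac{d\,t^n - b}{a - c\,t^n}$$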

Thus, integrals (23)-(25) in which R is a rational function of its 
arguments are always expressible in terms of elementary functions. 

8. Integration of Irrational Expressions Containing Quadratic
Trinomials. Here we mean integrals of the form

$$\int R\left(x, \sqrt{ax^2+bx+c}\right) dx \tag{26}$$

where R (x, y) is a rational function of its arguments. Such an
integral can also be expressed in terms of elementary functions in
all cases. In computing these integrals we apply trigonometric
substitutions. In order to do this we first complete the square and pass
to a new integral:

$$\int R\left(x, \sqrt{ax^2+bx+c}\right) dx = \int R\left(x, \sqrt{\pm(kx+l)^2 \pm m^2}\right) dx$$

where k, l and m are constants. After that we use one of the
following substitutions:

$kx + l = m \tan t$ for the radical $\sqrt{(kx+l)^2 + m^2}$,

$kx + l = m \sin t$ for the radical $\sqrt{-(kx+l)^2 + m^2}$ and

$kx + l = \frac{m}{\cos t}$ for the radical $\sqrt{(kx+l)^2 - m^2}$

(of course, we cannot have the case $\sqrt{-(kx+l)^2 - m^2}$ for real
integrals). The substitutions enable us to extract the roots (check
it!) and thus we come to an integral of the form

$$\int R_1(\cos t, \sin t)\,dt \tag{27}$$

where $R_1$ (x, y) is another rational function of its arguments. In
Sec. 10 we shall describe methods of computing an integral of form
(27).
We sometimes use the hyperbolic substitutions

$$kx + l = m \sinh t, \qquad kx + l = m \tanh t \qquad \text{and} \qquad kx + l = m \cosh t$$

There are certain direct methods that can be applied to computing
integrals of form (26). For instance, we can often pass to an integral
of the form

$$\int \frac{P_n(x)}{\sqrt{ax^2+bx+c}}\,dx \tag{28}$$

where $P_n(x)$ is a polynomial of the nth degree. Such an integral
can be easily found with the help of the method of undetermined
coefficients. Let us show that the integral can be represented in
the form

$$Q_{n-1}(x)\sqrt{ax^2+bx+c} + K\int \frac{dx}{\sqrt{ax^2+bx+c}} \tag{29}$$

where $Q_{n-1}(x)$ is a polynomial of the (n − 1)th degree and K is
a constant. The last integral is readily found (see the end of Sec. 5).
For definiteness, let n = 3. Equating (28) to (29) we obtain

$$\int \frac{\alpha x^3 + \beta x^2 + \gamma x + \delta}{\sqrt{ax^2+bx+c}}\,dx = (Ax^2+Bx+C)\sqrt{ax^2+bx+c} + K\int\frac{dx}{\sqrt{ax^2+bx+c}} \tag{30}$$

where all the coefficients on the left-hand side are given and the
coefficients A, B, C and K should be determined. In order to find
them let us differentiate equality (30):

$$\frac{\alpha x^3+\beta x^2+\gamma x+\delta}{\sqrt{ax^2+bx+c}} = (2Ax+B)\sqrt{ax^2+bx+c} + (Ax^2+Bx+C)\frac{2ax+b}{2\sqrt{ax^2+bx+c}} + \frac{K}{\sqrt{ax^2+bx+c}},$$

$$\alpha x^3+\beta x^2+\gamma x+\delta = (2Ax+B)(ax^2+bx+c) + \frac{1}{2}(Ax^2+Bx+C)(2ax+b) + K$$

Equating coefficients in equal powers of x, that is coefficients in
$x^3$, $x^2$, $x^1 = x$, $x^0 = 1$, we find, in succession:

$$3aA = \alpha$$
$$\frac{5}{2}bA + 2aB = \beta$$
$$2cA + \frac{3}{2}bB + aC = \gamma$$
$$cB + \frac{1}{2}bC + K = \delta$$
We have a ≠ 0 and therefore A is easily found from the first
equation. Substituting the value of A thus found into the second
equation we determine B etc. Consequently, we thus can determine
all the coefficients A, B, C and K which justifies formula (30).
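
The successive determination of A, B, C and K for n = 3 can be
sketched in a few lines of Python. This is only an illustration under
our own assumptions: the function names and the sample coefficients
are hypothetical, and the four equations are the ones obtained by
equating the coefficients of $x^3$, $x^2$, x and 1 after differentiating
(30) and multiplying by the radical.

```python
def undetermined_coefficients(alpha, beta, gamma, delta, a, b, c):
    """Solve the four coefficient equations in succession (requires a != 0)."""
    A = alpha / (3 * a)                         # 3aA = alpha
    B = (beta - 2.5 * b * A) / (2 * a)          # (5/2)bA + 2aB = beta
    C = (gamma - 2 * c * A - 1.5 * b * B) / a   # 2cA + (3/2)bB + aC = gamma
    K = delta - c * B - 0.5 * b * C             # cB + (1/2)bC + K = delta
    return A, B, C, K

def identity_defect(coeffs, params, x):
    """Left-hand side minus right-hand side of the polynomial identity at x."""
    alpha, beta, gamma, delta, a, b, c = params
    A, B, C, K = coeffs
    lhs = alpha * x**3 + beta * x**2 + gamma * x + delta
    rhs = ((2 * A * x + B) * (a * x**2 + b * x + c)
           + 0.5 * (A * x**2 + B * x + C) * (2 * a * x + b) + K)
    return lhs - rhs

# Sample data: alpha, beta, gamma, delta, then a, b, c (chosen arbitrarily).
params = (1.0, -2.0, 0.5, 3.0, 2.0, -2.0, 1.0)
coeffs = undetermined_coefficients(*params)
print(coeffs)
print([identity_defect(coeffs, params, x) for x in (-1.0, 0.0, 2.0)])
```

The defects printed in the last line should vanish up to rounding
errors, since both sides of the identity are polynomials of the third
degree whose coefficients have been matched.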
For instance, let it be necessary to compute the integral

$$I = \int \sqrt{2x^2 - 2x + 1}\,dx$$

In order to do this we write

$$\int \sqrt{2x^2-2x+1}\,dx = \int \frac{2x^2-2x+1}{\sqrt{2x^2-2x+1}}\,dx = (Ax+B)\sqrt{2x^2-2x+1} + K\int\frac{dx}{\sqrt{2x^2-2x+1}}$$

Differentiation gives

$$\frac{2x^2-2x+1}{\sqrt{2x^2-2x+1}} = A\sqrt{2x^2-2x+1} + (Ax+B)\frac{4x-2}{2\sqrt{2x^2-2x+1}} + \frac{K}{\sqrt{2x^2-2x+1}}$$

Hence, we have

$$2x^2 - 2x + 1 = A(2x^2-2x+1) + (Ax+B)(2x-1) + K$$

and therefore

$$4A = 2, \qquad -3A + 2B = -2, \qquad A - B + K = 1$$

which yields $A = \frac{1}{2}$, $B = -\frac{1}{4}$, $K = \frac{1}{4}$.

Thus, we obtain

$$\int \frac{dx}{\sqrt{2x^2-2x+1}} = \frac{1}{\sqrt{2}}\int \frac{d\left(x-\frac{1}{2}\right)}{\sqrt{\left(x-\frac{1}{2}\right)^2+\frac{1}{4}}} = \frac{1}{\sqrt{2}}\ln\left|x - \frac{1}{2} + \sqrt{\left(x-\frac{1}{2}\right)^2+\frac{1}{4}}\right| + C =$$
$$= \frac{1}{\sqrt{2}}\ln\left|\frac{2x - 1 + \sqrt{2}\sqrt{2x^2-2x+1}}{2}\right| + C = \frac{1}{\sqrt{2}}\ln\left|2x-1+\sqrt{2}\sqrt{2x^2-2x+1}\right| + C_1$$

where $C_1 = -\frac{1}{\sqrt{2}}\ln 2 + C$.

Finally,

$$I = \left(\frac{x}{2} - \frac{1}{4}\right)\sqrt{2x^2-2x+1} + \frac{1}{4\sqrt{2}}\ln\left|2x - 1 + \sqrt{2}\sqrt{2x^2-2x+1}\right| + C_2$$

Of course, we can simply write C instead of $C_2$ in the final answer.
The integral

$$\int \frac{dx}{(x-\alpha)^n \sqrt{ax^2+bx+c}} \qquad (n = 1, 2, \ldots) \tag{31}$$

can be reduced to integral (28) by means of the substitution
$x - \alpha = \frac{1}{t}$. Hence, after the substitution has been carried out we can
apply the method of undetermined coefficients. The method can
also be applied to an integral of the form

$$\int \frac{Q(x)}{P(x)\sqrt{ax^2+bx+c}}\,dx \tag{32}$$

where P (x) and Q (x) are polynomials. Indeed, if we decompose
the fraction $\frac{Q(x)}{P(x)}$ into an entire part and a sum of partial rational
fractions of type $\frac{A}{(x-\alpha)^{\nu}}$ [see formula (VIII.37)] then integral (32)
breaks into a sum of integrals of forms (28) and (31).

9. Integrals of Binomial Differentials. A binomial differential
is an expression of the form

$$(ax^n + b)^p x^m\,dx$$

which enters into an integral of the form

$$\int (ax^n + b)^p x^m\,dx$$

that we are going to consider here. The numbers n, p and m in the 
expression are rational numbers, that is they are integers or rational 
fractional numbers. In 1730 Christian Goldbach (1690-1764), a Rus- 
sian mathematician, indicated three cases when the integral of 
a binomial differential can be expressed in terms of elementary 
functions: 

1. The number p is an integer. If p > 0 we should simply remove
the brackets and perform termwise integration. If p < 0 we must
make the substitution $x = t^k$ and choose k in such a way that all
the exponents become integers. This being always possible,
we thus get an integral of a rational function.

2. The number $\frac{m+1}{n}$ is an integer. Substitute $ax^n + b = u$.
Then we get

$$\int (ax^n+b)^p x^m\,dx = \frac{1}{n\,a^{\frac{m+1}{n}}} \int u^p (u-b)^{\frac{m+1}{n}-1}\,du$$

The exponent $\frac{m+1}{n} - 1$ being an integer, we thus arrive at
preceding case 1.

3. The number $\frac{m+1}{n} + p$ is an integer. Here we use the
substitution $ax^n + b = ux^n$ which again yields case 1. We leave the
calculations to the reader.

It was only in 1853 that the prominent Russian mathematician 
P. L. Chebyshev (1821-1894) proved that the integral of a binomial 
differential cannot be expressed in terms of elementary functions 
(see Sec. 11) except for the three cases enumerated above. 
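
The three cases enumerated above are easy to test mechanically when
m, n and p are given as rational numbers. The following Python sketch
is only an illustration; the function name chebyshev_case is our own
(hypothetical) choice, and exact rational arithmetic from the standard
fractions module is used so that the integrality tests are not spoiled
by rounding.

```python
from fractions import Fraction

def chebyshev_case(m, n, p):
    """Return 1, 2 or 3 for the integrable cases listed above, else None."""
    m, n, p = Fraction(m), Fraction(n), Fraction(p)
    if p.denominator == 1:           # case 1: p is an integer
        return 1
    q = (m + 1) / n
    if q.denominator == 1:           # case 2: (m + 1)/n is an integer
        return 2
    if (q + p).denominator == 1:     # case 3: (m + 1)/n + p is an integer
        return 3
    return None                      # not expressible in elementary functions

print(chebyshev_case(0, 2, Fraction(1, 3)))    # (x^2 + 1)^(1/3)
print(chebyshev_case(1, 2, Fraction(1, 2)))    # x (a x^2 + b)^(1/2)
print(chebyshev_case(0, 2, Fraction(-3, 2)))   # (a x^2 + b)^(-3/2)
```

A result of None corresponds, by Chebyshev's theorem, to an integral
that cannot be expressed in terms of elementary functions.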

10. Integration of Functions Rationally Involving Trigonometric
Functions. Here we shall deal with an integral of the form

$$\int R(\sin x, \cos x)\,dx \tag{33}$$

where R (u, v) is a rational function in u and v. Such an integral
is always expressible in terms of elementary functions. To prove
this let us make the so-called universal substitution $\tan\frac{x}{2} = t$.
Then we have

$$\sin x = \frac{2\tan\frac{x}{2}}{1+\tan^2\frac{x}{2}} = \frac{2t}{1+t^2}, \qquad \cos x = \frac{1-t^2}{1+t^2}, \qquad dx = \frac{2\,dt}{1+t^2} \tag{34}$$

(verify the calculations!). Hence, integral (33) reduces to the
integral

$$\int R\left(\frac{2t}{1+t^2}, \frac{1-t^2}{1+t^2}\right)\frac{2\,dt}{1+t^2}$$

where the integrand is a rational function of t. The last integral can
be found by means of the method of Sec. 6.
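
To see the universal substitution at work numerically, we can compare
a definite integral in the variable x with the transformed integral in
the variable t. The sketch below is an assumption-laden illustration:
the rational function R (u, v) = 1/(2 + v), the interval from 0 to π/2
and the helper names are our own choices. For this R the transformed
integrand reduces to $\frac{2}{3+t^2}$, so the exact value is
$\frac{\pi}{3\sqrt{3}}$.

```python
import math

def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def R(s, c):
    """A sample rational function of sin x and cos x (chosen for illustration)."""
    return 1.0 / (2.0 + c)

def in_x(x):
    return R(math.sin(x), math.cos(x))

def in_t(t):
    # Formulas (34): sin x, cos x and dx expressed through t = tan(x/2).
    s = 2 * t / (1 + t * t)
    c = (1 - t * t) / (1 + t * t)
    return R(s, c) * 2 / (1 + t * t)

left = simpson(in_x, 0.0, math.pi / 2)
right = simpson(in_t, 0.0, math.tan(math.pi / 4))   # t runs from 0 to 1
print(left, right, math.pi / (3 * math.sqrt(3)))
```

All three printed values should agree closely.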

The universal substitution (34) often leads to very complicated 
expressions containing rational fractions and it is therefore pre- 
ferable to avoid it in problem-solving practice. In certain particular 
cases it is better to use some other substitutions which we are going 
to consider here. 
1. Let the integrand in integral (33) be an odd function with
respect to sin x, that is let R (−sin x, cos x) ≡ −R (sin x, cos x).
Then we can write

$$I = \int R(\sin x, \cos x)\,dx = \int \frac{R(\sin x, \cos x)}{\sin x}\,\sin x\,dx = \int R_1(\sin x, \cos x)\sin x\,dx$$

where $R_1$ is an even function with respect to sin x. $R_1$ being a
rational function, we can easily express it in terms of $\sin^2 x$ and cos x.
It follows that

$$I = \int R_2(\sin^2 x, \cos x)\sin x\,dx = -\int R_2(1-\cos^2 x, \cos x)\,d\cos x$$

and therefore if we put cos x = t we arrive at an integral of a
rational function.

2. Similarly, if the integrand in (33) is an odd function with
respect to cos x then the substitution sin x = t rationalizes the
integral.

For example,

$$\int \frac{\sin^2 x\,dx}{\cos^3 x - 2\sin x\cos x} = \int \frac{\sin^2 x\cos x\,dx}{\cos^2 x\,(\cos^2 x - 2\sin x)}$$

Putting sin x = t, cos x dx = dt we derive

$$\int \frac{\sin^2 x\,dx}{\cos^3 x - 2\sin x\cos x} = \int \frac{t^2\,dt}{(1-t^2)(1-t^2-2t)}$$

The last integral is readily found if we decompose the integrand
into partial fractions or if we take advantage of the equality

$$t^2 = \frac{1}{4}\left[(1-t^2) - (1-t^2-2t)\right]^2$$

3. If the integrand does not change its value when we simultaneously
change the signs of sin x and cos x, that is if

$$R(-\sin x, -\cos x) = R(\sin x, \cos x)$$

then we can apply the substitution tan x = t (or cot x = t). We
can easily verify that this yields the rationalization of the integral
in the general case but we are not going to do this here because
in every concrete example the advisability of the substitution is
confirmed by the results of the calculations.

Take an example:

$$\int \frac{dx}{\sin^2 x\cos^4 x} = \int \frac{dx}{\tan^2 x\cos^6 x}$$

Putting tan x = t, $\cos^2 x = \frac{1}{1+t^2}$, $\frac{dx}{\cos^2 x} = dt$, we complete the
integration:

$$\int \frac{dx}{\sin^2 x\cos^4 x} = \int \frac{(1+t^2)^2}{t^2}\,dt = \int\left(t^2 + 2 + \frac{1}{t^2}\right)dt = \frac{t^3}{3} + 2t - \frac{1}{t} + C = \frac{1}{3}\tan^3 x + 2\tan x - \cot x + C$$

The same integral can be computed if we represent it as

$$\int \frac{(\sin^2 x + \cos^2 x)^2}{\sin^2 x\cos^4 x}\,dx$$

and remove the brackets in the numerator (check it!).

Let us separately consider integrals of the form

$$\int \sin^m x\cos^n x\,dx \tag{35}$$

where m and n are arbitrary integers of any sign. In case m is odd
the integral belongs to case 1 considered above, and thus it can be
found by means of the substitution cos x = t. If n is odd the
integral belongs to case 2. Finally, if both m and n are even we have
case 3. But the calculations can sometimes be simplified. For
instance, if m ≥ 0 and n ≥ 0 and if both m and n are even we can
apply the formulas

$$\sin^2 x = \frac{1-\cos 2x}{2}, \qquad \sin x\cos x = \frac{1}{2}\sin 2x \qquad \text{and} \qquad \cos^2 x = \frac{1+\cos 2x}{2}$$

For example,

$$\int \sin^2 x\cos^6 x\,dx = \int (\sin x\cos x)^2\cos^4 x\,dx = \frac{1}{16}\int \sin^2 2x\,(1+\cos 2x)^2\,dx =$$
$$= \frac{1}{16}\int \sin^2 2x\,dx + \frac{1}{8}\int \sin^2 2x\cos 2x\,dx + \frac{1}{16}\int \sin^2 2x\cos^2 2x\,dx =$$
$$= \frac{1}{32}\int (1-\cos 4x)\,dx + \frac{1}{16}\int \sin^2 2x\,d(\sin 2x) + \frac{1}{64}\int \sin^2 4x\,dx =$$
$$= \frac{1}{32}\left(x - \frac{\sin 4x}{4}\right) + \frac{1}{16}\cdot\frac{\sin^3 2x}{3} + \frac{1}{128}\int (1-\cos 8x)\,dx =$$
$$= \frac{5}{128}x - \frac{1}{128}\sin 4x + \frac{1}{48}\sin^3 2x - \frac{1}{1024}\sin 8x + C$$

The same result can be obtained if we express the trigonometric
functions in terms of exponential functions by using Euler's
formulas (VIII.11).
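
A long chain of double-angle reductions of this kind is easy to get
wrong, so a numerical cross-check is worthwhile. The Python sketch
below (the helper names and the interval of integration are our own
assumptions) compares the increment of the antiderivative
$\frac{5}{128}x - \frac{1}{128}\sin 4x + \frac{1}{48}\sin^3 2x - \frac{1}{1024}\sin 8x$
with a direct quadrature of $\sin^2 x\cos^6 x$.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def integrand(x):
    return math.sin(x) ** 2 * math.cos(x) ** 6

def antiderivative(x):
    """Result of the double-angle reduction, with the constant C omitted."""
    return (5 * x / 128 - math.sin(4 * x) / 128
            + math.sin(2 * x) ** 3 / 48 - math.sin(8 * x) / 1024)

exact = antiderivative(2.0) - antiderivative(0.3)
numeric = simpson(integrand, 0.3, 2.0)
print(exact, numeric)
```

The two printed numbers should agree to many decimal places.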

We sometimes perform integration by parts in computing
integrals (35) in order to reduce the positive exponents and increase
the negative exponents in powers of sin x and cos x. For example,
we can put cos x = u and $\frac{\cos x}{\sin^3 x}\,dx = dv$ (that is
du = −sin x dx and $v = -\frac{1}{2\sin^2 x}$) when integrating the
function $\frac{\cos^2 x}{\sin^3 x}$:

$$\int \frac{\cos^2 x}{\sin^3 x}\,dx = -\frac{\cos x}{2\sin^2 x} - \int \frac{dx}{2\sin x}$$

Now, making the change of variable $\tan\frac{x}{2} = t$ we obtain

$$\int \frac{\cos^2 x}{\sin^3 x}\,dx = -\frac{\cos x}{2\sin^2 x} - \frac{1}{2}\int \frac{2\,dt\,(1+t^2)^{-1}}{2t\,(1+t^2)^{-1}} = -\frac{\cos x}{2\sin^2 x} - \frac{1}{2}\ln|t| + C = -\frac{\cos x}{2\sin^2 x} - \frac{1}{2}\ln\left|\tan\frac{x}{2}\right| + C$$

[here we have utilized formula (34) in transforming the second
integral].

11. General Remarks. Since integration is a much more complicated
procedure than differentiation the reader must carefully
study the basic methods of integration. But, on the other hand,
it is inexpedient to carry out complicated calculations every time
it is necessary to compute an integral. It is therefore advisable
to use reference books in which the most widely encountered
integrals are collected in an orderly way. In particular, we refer the
reader to [7], [19] and [46].

Many important integrals are not elementary functions, that is 
they cannot be expressed in terms of finite combinations of the 
simplest elementary functions which are studied in elementary 
mathematical courses. For instance, the integral 

$$\int \sqrt[3]{x^2+1}\,dx = \int (x^2+1)^{\frac{1}{3}}\,dx$$

belongs to the type considered in Sec. 9. But since here we have
n = 2, $p = \frac{1}{3}$ and m = 0 the integral cannot be reduced to those
three cases we studied in Sec. 9. Similarly, the integrals

$$\int \sin x \cdot x^{\alpha}\,dx, \qquad \int e^{\pm x} x^{\alpha}\,dx, \qquad \int \cos x\cdot x^{\alpha}\,dx \qquad (\alpha \ne 0, 1, 2, \ldots) \tag{36}$$

are not expressible in terms of elementary functions, and therefore 
all the integrals that can be reduced to these integrals cannot be 
expressed in terms of elementary functions either. Examples of 
such integrals are 


$$\int e^{-x^2}\,dx = \frac{1}{2}\int e^{-u} u^{-\frac{1}{2}}\,du$$

(we have applied the substitution $x^2 = u$, $dx = \frac{du}{2\sqrt{u}}$ here),

$$\int \sin x^2\,dx = \frac{1}{2}\int \sin u\cdot u^{-\frac{1}{2}}\,du \qquad (x^2 = u),$$

$$\int \cos x^2\,dx = \frac{1}{2}\int \cos u\cdot u^{-\frac{1}{2}}\,du \qquad (x^2 = u)$$

and

$$\int \sqrt{\sin x}\,dx = \int u^{\frac{1}{2}}(1-u^2)^{-\frac{1}{2}}\,du$$

(the last transformation is carried out by means of the change of
variable sin x = u, $dx = \frac{du}{\sqrt{1-u^2}}$; this results in the integral on
the right-hand side which belongs to the type considered in Sec. 9
for n = 2, $p = -\frac{1}{2}$ and $m = \frac{1}{2}$).

There are wide classes of such non-elementary integrals. For 
instance, as a rule, integrals of the form 


$$\int R\left(x, \sqrt{P_n(x)}\right) dx \tag{37}$$

where R, as before, is the sign of a rational function and $P_n(x)$
is a polynomial of degree n ≥ 3, are not expressible in terms of
elementary functions.

In the past the fact that certain integrals cannot be reduced to 
elementary functions was thought of as a catastrophe. But now we 
can easily overcome such difficulties. First of all, there are extensive 
tables of many important non-elementary functions in terms of which 
very many integrals that cannot be reduced to elementary functions 
are expressed. In Sec. XIV.12 we shall give examples of such non- 
elementary functions (special functions) to which all the integrals 
of form (36) can be reduced. Integral (37) for n = 3 and n = 4 is
called an elliptic integral. Such an integral can be expressed in
terms of the so-called elliptic functions which are thoroughly
investigated. These methods and formulas can be found in the reference
books we mentioned above. We also refer the reader to [23].

Besides, at present the techniques of computing integrals have 
become so perfect that the investigation of a function represented 
in the form of an integral is not more difficult than the investigation 
of a function represented directly, without the integral signs. There- 
fore now even when we encounter an integral which is expressible 
in terms of elementary functions but which involves very compli- 
cated combinations of them we usually prefer to deal with the inte- 
gral representation of the function. 


CHAPTER XIV 


Definite Integral 


In solving many important problems we have to sum up an infinite 
number of infinitesimal summands. This leads to one of the basic 
concepts of mathematics, namely to that of the definite integral. 
Essentially, it is this concept to which all the methods of integration 
presented in Chapter XIII are applied. 

§ 1. Definition and Basic Properties 

1. Examples Leading to the Concept of Definite Integral. Let us
consider a problem which is the reverse of the one considered at the
end of Sec. IV.1 that led us to the concept of a derivative. Namely,
let us regard the law of variation of the instantaneous velocity of
a material point v = v (t) as known and calculate the path
length covered during a period of time from t = α to t = β.

Since we do not suppose that the motion is uniform we cannot
compute the path as the product of the velocity by the time taken.
We shall therefore apply the following procedure. Let us divide
the whole time interval into a large number of small subintervals
of time which may not be equal to each other:

$$t_0 = \alpha \le t \le t_1, \quad t_1 \le t \le t_2, \quad \ldots, \quad t_{n-1} \le t \le t_n = \beta$$

where $t_1, \ldots, t_{n-1}$ are some intermediate instants of time which
are chosen arbitrarily. If these subintervals are sufficiently small
we can regard the motion as being uniform during each of the
subintervals without making a considerable error. Hence we can put
down the following approximate expression of the path:

$$s \approx v_1\Delta t_1 + v_2\Delta t_2 + \cdots + v_n\Delta t_n \tag{1}$$

Here $v_k$ (k = 1, 2, ..., n) is one of the values of the instantaneous
velocity v attained on the kth subinterval of time, i.e. $v_k = v(\tau_k)$
where $t_{k-1} \le \tau_k \le t_k$, and $\Delta t_k = t_k - t_{k-1}$ is the length of the
subinterval. [The reader should pay attention to the difference

where each of the points $\xi_k$ is arbitrarily chosen between $x_{k-1}$ and
$x_k$, that is somewhere on the kth subinterval, and $\Delta x_k = x_k - x_{k-1}$.
Now let the lengths $\Delta x_k$ be infinitely decreased; then the limit to
which the integral sum tends in this process is called the definite
integral of the function f (x) taken over the interval of integration
α ≤ x ≤ β. The definite integral is denoted as

$$\int_{\alpha}^{\beta} f(x)\,dx = \lim \sum_{k=1}^{n} f(\xi_k)\,\Delta x_k \tag{7}$$

Accordingly, in the examples considered in Sec. 1 we obtain,
respectively,

$$s = \int_{\alpha}^{\beta} v(t)\,dt, \quad V = \int_{\alpha}^{\beta} w(t)\,dt, \quad M = \int_{\alpha}^{\beta} \rho(s)\,ds \quad \text{and} \quad S = \int_{\alpha}^{\beta} f(x)\,dx \tag{8}$$

The last equality implies the geometric meaning of the definite
integral in the case when the function y = f (x) (the integrand) is
positive: in this case the integral is equal to the area of the
curvilinear trapezoid bounded by the graph of the function, the axis of
abscissas and the straight lines parallel to the axis of ordinates
passing through the end-points of the interval of integration. The
end-points, that is the numbers α and β, are called, respectively,
the lower limit and the upper limit of integration. The expression
f (x) dx is called the element of integration.

If the integrand is negative or changes its sign then some terms 
entering into integral sum (6) will be negative. Therefore after 
passing to the limit we see that the integral is equal to the algebraic 
sum of the areas of the parts of the curvilinear trapezoid which lie
over and under the x-axis (see Fig. 245). The areas of the parts
lying over the x-axis are taken with the sign + and those under the
x-axis are taken with the sign −.

Comparing formulas (8) we can also conclude that in order to
calculate the path length covered by a point in its rectilinear
motion, for a given relationship between the velocity and the time
taken which is represented by a graph (see Fig. 246), it is sufficient
to compute the area of the corresponding curvilinear trapezoid.
In this example we must also take the area with the sign − if v < 0,
that is if the graph lies under the t-axis, because the increment of
the coordinate of the moving point is negative in such a case. This
rule of signs for computing an area holds for a great number of
other examples.

Let us dwell in more detail on the passage to the limit in
formula (7). The limit is sometimes said to be taken as n → ∞ but
this is not precise because we do not suppose that the subintervals
$\Delta x_k$ are of the same length and therefore if we limit ourselves to
the condition that only n → ∞ then we can encounter a case when
the subintervals belonging to one part of the interval α ≤ x ≤ β
decrease whereas the others do not. It is therefore better to say that
the limit is taken while the lengths of the subintervals are infinitely
decreased. The degree of the decrease can be characterized by the
largest of the lengths $\Delta x_k$ of a given partition because if the largest
length is small then the other lengths are automatically small.
Hence, we can say that the passage to the limit in formula (7) is
performed in a process in which $\max_k \Delta x_k \to 0$.

Let us consider an example of calculating a definite integral on
the basis of its definition (7). Let it be necessary to compute the
integral $\int_0^1 x^2\,dx$. Divide the interval of integration into five equal
parts of length 0.2. For definiteness, let us choose a point on each
of the subintervals at its left end-point. Then $\xi_1 = 0.0$, $\xi_2 = 0.2$,
$\xi_3 = 0.4$, $\xi_4 = 0.6$, $\xi_5 = 0.8$ and

$$\int_0^1 x^2\,dx \approx \sum_{k=1}^{5} \xi_k^2\,\Delta x_k = (0.0^2 + 0.2^2 + 0.4^2 + 0.6^2 + 0.8^2)\cdot 0.2 = 0.24$$

An analogous division into 10 subintervals would yield the value
0.29 and the division into 100 equal subintervals would yield the
value 0.33. (In Sec. 3 we shall find the exact value of the integral
which turns out to be equal to $\frac{1}{3}$; thus the above value 0.33 is very
close to the exact value. We suggest that the reader should elucidate
the fact that the approximate values obtained in our example are
smaller than the exact value, by means of a graphical construction.)
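
The integral sums of this example are easily reproduced on a computer.
The following Python sketch is merely an illustration (the function
name left_riemann_sum is our own); it forms the sum entering into
definition (7) for n equal subintervals with the left end-points taken
as the intermediate points.

```python
def left_riemann_sum(f, lo, hi, n):
    """Integral sum on n equal subintervals, sampled at left end-points."""
    dx = (hi - lo) / n
    return sum(f(lo + k * dx) for k in range(n)) * dx

def square(x):
    return x * x

for n in (5, 10, 100, 10000):
    print(n, left_riemann_sum(square, 0.0, 1.0, n))
```

For n = 5, 10 and 100 the sums are 0.24, 0.285 and 0.32835 (up to
rounding errors), in agreement with the rounded values quoted above;
all of them lie below the exact value 1/3 because the integrand
increases on the interval and is sampled at left end-points.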

We see that there is arbitrariness in forming an integral
sum because it depends both on the choice of the points of division
$x_k$ and on the choice of intermediate points $\xi_k$. But nevertheless
if we take a partition whose subintervals are sufficiently small the
sum will be practically equal to its limit, that is to integral (7)
which, of course, depends neither on the points $x_k$ nor on the points
$\xi_k$. Each of the summands entering into an integral sum becomes
very small when we take a partition in which $\Delta x_k$ are sufficiently
small, the smallness of the summands being implied by the
smallness of $\Delta x_k$. But at the same time the number of summands becomes
so large that the whole sum has a finite value. Roughly speaking,
if the number of summands entering into an integral sum is equal
to n then each $\Delta x_k$ (and therefore each of the summands) is of the
order of $\frac{1}{n}$ whereas the whole sum is of the order of $n\cdot\frac{1}{n} = 1$, i.e.
it is finite. Thus, taking into account that we pass to the limit in
formula (7) we can say that the definite integral is a sum of infinitely
many infinitesimal summands. Practically, we can often regard
a definite integral as a sum of a great number of very small
homogeneous summands (that is the summands of the same dimension, of
the same character, of the same sense etc.), the summands being
so small that the sum is practically equal to its limit. Such an
approach completely corresponds to the practical concept of
infinitely large and infinitely small quantities which are understood (see
Secs. III.1 and III.3) as quantities that are, respectively, very large
and very small but finite, theoretically. It should be noted that
not every sum of an infinite number of infinitesimal summands yields
an integral. Indeed, as we have seen, for such a sum to assume a finite
value, the number and the magnitude of the summands should be
coherent in a certain sense.

The interpretation of the integral as a sum accounts for its
notation. In fact, if we regard the summands entering into sum (6) as
infinitesimals and denote the lengths of the subintervals as
$\Delta x_k = dx$ then the whole sum (6) can be put down in the form

$$\sum_{(\text{from } x=\alpha)}^{(\text{to } x=\beta)} f(x)\,dx$$

At first sums were denoted by the letter S. Then it was gradually
lengthened and this resulted in modern notation (7).

In conclusion note that an integrand can be either a continuous 
function on the interval of integration or a discontinuous one, that 
is having points of discontinuity. But in § 1 we impose the condi- 
tion that the interval of integration is finite and that the 
integrand does not approach infinity on the interval. It can be 
shown that under these assumptions (and under some additional 
requirements) the definite integral exists, that is it has a finite 
value. A rigorous proof of this assertion which is not based on phy- 
sical or geometric considerations can be found in more comprehen- 
sive courses on mathematical analysis. As we shall show in § 4, 
the integral may not have a certain numerical value if the above 
conditions are violated. 

3. Relationship Between Definite Integral and Indefinite Integral.
We begin with a simple remark that a definite integral does not
depend on the notation of the variable of integration, i.e.

$$\int_{\alpha}^{\beta} f(x)\,dx = \int_{\alpha}^{\beta} f(t)\,dt = \int_{\alpha}^{\beta} f(s)\,ds = \ldots \tag{9}$$

Indeed, for example, this is implied by the fact that all the
integrals put down above are equal to the same area. Thus, the variable
of integration in a definite integral is a dummy variable similar
to an index of summation (see Sec. III.6) and it can be denoted by
any letter or symbol.

Let f (x) be a function that we are going to integrate. But let
the lower limit alone (denoted as $x_0$) be fixed, and the upper limit
(which we denote by x) be arbitrary, i.e. variable. Then the value
of the integral itself will depend on x, and we can therefore denote
it as Φ (x). Hence we can write

$$\Phi(x) = \int_{x_0}^{x} f(x)\,dx \qquad (x_0 = \text{const})$$

or, taking into account equalities (9),

$$\Phi(x) = \int_{x_0}^{x} f(t)\,dt \qquad (x_0 = \text{const}) \tag{10}$$

The first form of writing may sometimes lead to misunderstandings
because the letter x entering into it is simultaneously understood
in two different senses, namely as the variable of integration and
as the upper limit. Therefore, although the first form is admissible,
the second is preferable.
Now let us prove that the function Φ (x) thus constructed is an
antiderivative (see Sec. XIII.1) of the integrand f (x), that is

$$\frac{d}{dx}\int_{x_0}^{x} f(t)\,dt = \left(\int_{x_0}^{x} f(t)\,dt\right)'_x = f(x)$$

Thus, we assert that the derivative of a definite integral with
respect to the upper limit is equal to the value of the integrand for
the value of its argument equal to the upper limit. To prove the
assertion we first suppose that f (x) is a continuous function and
consider Fig. 247. The geometric meaning of the integral implies
that if x gains an increment Δx then ΔΦ is equal to the area
shaded in Fig. 247. This area is approximately equal to the
product $\Delta x\cdot f^*$ where $f^*$ is an intermediate ordinate equal to one
of the values of the function taken between the points x and
x + Δx. It follows that $\frac{\Delta\Phi}{\Delta x} = f^* = f(x^*)$. Hence, if Δx → 0
then $x^* \to x$, and therefore passing to the limit we obtain

$$\Phi'(x) = \lim_{\Delta x\to 0}\frac{\Delta\Phi}{\Delta x} = \lim_{\Delta x\to 0} f(x^*) = f(x)$$

which is what we set out to prove.

Fig. 247

Fig. 248

In particular, we see that a continuous function always has an
antiderivative (see Sec. XIII.1). To find one of the antiderivatives
we can evaluate the definite integral of the given function for a
fixed lower limit and regard it as a function of its upper limit.
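
The assertion just proved can also be observed numerically: if the
integral with a variable upper limit is computed by some quadrature
rule, its difference quotient should be close to the value of the
integrand at the upper limit. In the Python sketch below the integrand
cos t, the fixed lower limit 0, the point x = 1.2 and the increment
are our own illustrative assumptions.

```python
import math

def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def Phi(x, f, x0=0.0):
    """Integral of f from the fixed lower limit x0 to the variable upper limit x."""
    return simpson(f, x0, x)

f = math.cos
x, dx = 1.2, 1e-5
difference_quotient = (Phi(x + dx, f) - Phi(x, f)) / dx
print(difference_quotient, f(x))
```

The two printed values differ only by a quantity of the order of the
increment dx, and the discrepancy decreases together with dx.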
Now suppose that the integrand is discontinuous (but finite since
for the time being we consider only finite functions). Then function
(10) is continuous at the points of discontinuity of f (x) but the
derivative of Φ (x) has jump discontinuities at these points. Hence,
the graph of Φ (x) is “broken” at such points (see Fig. 248). We
extend the notion of an antiderivative when we admit these “breaks”
because at a point of this kind there is no single value of the
derivative. But this natural extension enables us to say that each
function that is finite everywhere has an antiderivative which is a
continuous function.

Now suppose that we have to evaluate the integral

$$I = \int_\alpha^\beta f(x)\,dx$$

and that we know one of the antiderivatives of the function $f(x)$ which we designate as $F(x)$. The function $\int_\alpha^x f(t)\,dt$ also being an antiderivative of $f(x)$, we have, by Sec. XIII.1, the relation

$$\int_\alpha^x f(t)\,dt = F(x) + C$$

where $C$ is a constant. Putting here $x = \alpha$ we see that the geometric meaning of the integral implies that the left-hand side of the relation vanishes, that is we have

$$0 = F(\alpha) + C, \qquad C = -F(\alpha) \qquad \text{and} \qquad \int_\alpha^x f(t)\,dt = F(x) - F(\alpha)$$

Putting $x = \beta$ in the last formula we obtain, on the basis of relation (9), the formula

$$\int_\alpha^\beta f(x)\,dx = F(\beta) - F(\alpha) \tag{11}$$

Thus, a definite integral is equal to the increment of an antiderivative of the integrand corresponding to the variation of the independent variable from the lower limit of integration to the upper limit. The right-hand side of formula (11) is also designated as $F(x)\big|_\alpha^\beta$ where $\big|_\alpha^\beta$ is the sign of double substitution which means that the lower limit and the upper limit must be substituted for the argument into the function and then the first result subtracted from the second.
Another way of writing formula (11) is

$$\int_\alpha^\beta f(x)\,dx = \Bigl(\int f(x)\,dx\Bigr)\Big|_\alpha^\beta \tag{12}$$

Formula (12) is justified by the fact that

$$\Bigl(\int f(x)\,dx\Bigr)\Big|_\alpha^\beta = \bigl(F(x) + C\bigr)\Big|_\alpha^\beta = [F(\beta) + C] - [F(\alpha) + C] = F(\beta) - F(\alpha) = \int_\alpha^\beta f(x)\,dx$$

where $F(x)$ is one of the antiderivatives and $C$ is a constant (see Sec. XIII.1).

Thus, a definite integral is equal to the increment of the corres- 
ponding indefinite integral. This result is one of the most important 
theorems in mathematics. It is called the Newton-Leibniz theorem. 

Let us take an example:

$$\int_0^1 x^2\,dx = \Bigl(\int x^2\,dx\Bigr)\Big|_0^1 = \Bigl(\frac{x^3}{3}\Bigr)\Big|_0^1 = \frac{1}{3}$$

It should be noted that when evaluating the indefinite integral here we have not written the arbitrary constant $C$ because, as was shown, the terms $+C$ and $-C$ always cancel out.

We see that if the limits of integration are given a definite integral 
is a constant number whereas the corresponding indefinite integral 
is a function. 
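
The agreement between the limit of integral sums and the increment of an antiderivative can be observed on the example just considered. The sketch below is only an illustration under our own assumptions (the helper name `riemann_sum` and the choice of midpoints as the intermediate points are not taken from the text): it evaluates integral sums for $x^2$ on the interval from 0 to 1 and compares them with $F(1) - F(0)$ for $F(x) = x^3/3$.

```python
def riemann_sum(f, a, b, n):
    """Integral sum over n equal subintervals of [a, b], using midpoints."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

f = lambda x: x ** 2        # integrand
F = lambda x: x ** 3 / 3    # one of its antiderivatives (the constant C is omitted)

exact = F(1.0) - F(0.0)     # formula (11)
for n in (10, 100, 1000):
    print(n, riemann_sum(f, 0.0, 1.0, n), exact)
```

As the number of subintervals grows, the integral sums approach the constant number given by formula (11).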

Up to now we assumed that $\alpha < \beta$. Let us extend formula (11) to the case $\alpha \ge \beta$. This means that for $\alpha \ge \beta$ we regard formula (11) as the definition of the integral written on the left-hand side.

Since $f(x) = F'(x)$ formula (11) can be rewritten as

$$\int_\alpha^\beta F'(x)\,dx = F(\beta) - F(\alpha) \tag{13}$$

Hence, the definite integral of a derivative is equal to the incre- 
ment of the antiderivative. 

4. Basic Properties of Definite Integral. 

1. The interchange of the limits of integration yields the multiplication of the integral by $-1$. Actually, by formula (11), we have

$$\int_\beta^\alpha f(x)\,dx = F(\alpha) - F(\beta) = -[F(\beta) - F(\alpha)] = -\int_\alpha^\beta f(x)\,dx$$
This simple property can also be put down in the form $F(x)\big|_\beta^\alpha = -F(x)\big|_\alpha^\beta$ which enables us to substitute the limits in reverse order if we change the sign of the indefinite integral beforehand. For instance,

$$\int_3^5 \frac{dx}{x^2} = \Bigl(-\frac{1}{x}\Bigr)\Big|_3^5 = \frac{1}{x}\Big|_5^3 = \frac{1}{3} - \frac{1}{5} = \frac{2}{15}$$

In particular, property 1 implies the following rule of differentiating an integral with respect to its lower limit:

$$\frac{d}{dx}\Bigl(\int_x^{x_0} f(t)\,dt\Bigr) = \Bigl(\int_x^{x_0} f(t)\,dt\Bigr)'_x = -\frac{d}{dx}\Bigl(\int_{x_0}^{x} f(t)\,dt\Bigr) = -f(x)$$

2. If the limits of integration coincide then the integral is equal to zero, i.e.

$$\int_\alpha^\alpha f(x)\,dx = 0$$

Property 2 has been already used (see Sec. 3).

3. The theorem on “partition of the interval of integration”:

$$\int_\alpha^\beta f(x)\,dx + \int_\beta^\gamma f(x)\,dx = \int_\alpha^\gamma f(x)\,dx$$

for any $\alpha$, $\beta$ and $\gamma$. In fact, the left-hand side is equal to

$$[F(\beta) - F(\alpha)] + [F(\gamma) - F(\beta)] = F(\gamma) - F(\alpha)$$

which equals $\int_\alpha^\gamma f(x)\,dx$.

4. The integral of a sum of functions is equal to the sum of the integrals of the summands (the same is true for the difference)

$$\int_\alpha^\beta [f(x) \pm \varphi(x)]\,dx = \int_\alpha^\beta f(x)\,dx \pm \int_\alpha^\beta \varphi(x)\,dx$$

To prove the property we apply the analogous property of indefinite integrals (see Sec. XIII.9) and equate the increment of the right-hand side to the increment of the left-hand side as $x$ varies from $\alpha$ to $\beta$. The following property is proved in a similar way.

5. A constant factor can be taken outside the sign of the integral:

$$\int_\alpha^\beta M f(x)\,dx = M\int_\alpha^\beta f(x)\,dx \qquad (M = \text{const})$$
Properties 4 and 5 can be formulated simultaneously as “a definite integral is linear with respect to the integrand”.

The term “linear” is understood here in the sense of Sec. XI.6. Namely, the formula

$$\int_\alpha^\beta f(x)\,dx = I \tag{14}$$

for fixed $\alpha$ and $\beta$ determines a correspondence between the finite (integrable) functions defined over $\alpha \le x \le \beta$ and real numbers $I$, that is to each function there corresponds a certain number $I$. In other words, formula (14) defines a mapping of the infinite-dimensional linear space of such functions into the one-dimensional space of all real numbers. Properties 4 and 5 are then nothing but the condition that the mapping is linear. (For instance, let the reader verify that, for $\alpha = 1$ and $\beta = 2$, the number $I = \frac{7}{3}$ corresponds to the function $y = x^2$ and the number $\frac{3}{8}$ corresponds to the function $y = \frac{1}{x^3}$ whereas the number $5\cdot\frac{7}{3} - 3\cdot\frac{3}{8} \approx 10.54$ corresponds to the function $y = 5x^2 - \frac{3}{x^3}$.) A rule, a law, according to which to functions there correspond numbers, is called a functional. Hence, formula (14) determines a linear functional defined over the above functional space.
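
The verification suggested in parentheses can be carried out numerically. The sketch below is an illustration under our own assumptions (the functional is approximated by Simpson's rule, and the names `I`, `f`, `g` and `combo` are hypothetical): it treats formula (14) with limits 1 and 2 as a mapping from functions to numbers and checks that the mapping respects linear combinations.

```python
def I(f, a=1.0, b=2.0, n=2000):
    """Approximate value of the functional (14): Simpson's rule on [a, b], n even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

f = lambda x: x ** 2
g = lambda x: 1.0 / x ** 3
combo = lambda x: 5 * f(x) - 3 * g(x)   # the linear combination 5f - 3g

print(I(f), I(g))                       # close to 7/3 and 3/8
print(I(combo), 5 * I(f) - 3 * I(g))    # both close to 10.54
```

The last line prints the same number twice (up to rounding), once by integrating the combination and once by combining the integrals.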

6. The formula of integration by parts

$$\int_\alpha^\beta u v'\,dx = (uv)\Big|_\alpha^\beta - \int_\alpha^\beta u' v\,dx$$

is also deduced from the corresponding formula for indefinite integrals, namely from formula (XIII.13).

Take an instance:

$$\int_0^\pi x\sin x\,dx = (-x\cos x)\Big|_0^\pi + \int_0^\pi \cos x\,dx = (-x\cos x)\Big|_0^\pi + (\sin x)\Big|_0^\pi = \pi$$

(here we have put $u = x$, $dv = \sin x\,dx$, $du = dx$ and $v = -\cos x$).
7. The formula of integration by change of variable in definite integrals is obtained if we equate the increments of both sides of formula (XIII.15) corresponding to the variation of $t$ from $\alpha$ to $\beta$. Doing this and taking into account that the variable $x$ which equals $\varphi(t)$ varies from $\varphi(\alpha)$ to $\varphi(\beta)$, we obtain

$$\int_\alpha^\beta f[\varphi(t)]\,\varphi'(t)\,dt = \Bigl[\int f(x)\,dx\Bigr]_{t=\alpha}^{t=\beta} = \Bigl[\int f(x)\,dx\Bigr]_{x=\varphi(\alpha)}^{x=\varphi(\beta)}$$

Now, taking advantage of formula (12), we finally derive

$$\int_\alpha^\beta f[\varphi(t)]\,\varphi'(t)\,dt = \int_{\varphi(\alpha)}^{\varphi(\beta)} f(x)\,dx$$

Hence, in applying the formula we should additionally change the 
limits of integration. To do this we must find an interval which 
should be run by the new variable so that the old variable of inte- 
gration should vary over the interval that was originally set for it. 

For example, if we want to make the substitution $x = R\sin t$ in order to evaluate the integral

$$\int_0^R \sqrt{R^2 - x^2}\,dx$$

we must take into account that for $x$ to vary from $0$ to $R$, it is sufficient that $t$ should run from $0$ to $\frac{\pi}{2}$. Therefore

$$\int_0^R \sqrt{R^2 - x^2}\,dx = \int_0^{\pi/2} \sqrt{R^2 - R^2\sin^2 t}\; R\cos t\,dt = R^2\int_0^{\pi/2} \cos^2 t\,dt = \frac{R^2}{2}\int_0^{\pi/2} (1 + \cos 2t)\,dt = \frac{R^2}{2}\Bigl(t + \frac{\sin 2t}{2}\Bigr)\Big|_0^{\pi/2} = \frac{\pi R^2}{4}$$
As we see, in contrast to the change of variable in an indefinite integral, the inverse substitution, that is the transition from the new variable to the old one in the final answer, is not needed in computing a definite integral. We suggest that the reader should construct a geometric figure whose area is expressed by the above integral and, in addition, obtain the same result by means of the substitution $x = R\cos t$.
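
The role of the new limits can be seen in a short numerical experiment. This is a sketch under our own assumptions (the midpoint rule, the helper name `midpoint` and the sample radius `R = 2` are illustrative choices): the integral is computed once in the old variable $x$ over the limits $0$ and $R$, and once in the new variable $t$ over the limits $0$ and $\pi/2$ with the integrand multiplied by the derivative of the substitution.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

R = 2.0  # sample radius (arbitrary choice for the illustration)

# Old variable x, limits 0 .. R.
in_x = midpoint(lambda x: math.sqrt(R * R - x * x), 0.0, R)

# New variable t with x = R sin t: integrand f(phi(t)) * phi'(t), limits 0 .. pi/2.
in_t = midpoint(
    lambda t: math.sqrt(R * R - (R * math.sin(t)) ** 2) * R * math.cos(t),
    0.0, math.pi / 2)

print(in_x, in_t, math.pi * R * R / 4)
```

All three printed numbers should be close to each other; no return to the variable $x$ is needed once the limits have been changed.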

We have deduced properties 3-5 of the definite integral on the basis of formula (11). But the same could have been achieved with the help of definition (7) which introduces the definite integral as the limit of the corresponding integral sum. For instance, passing to the limit in the formula

$$\sum_{k=1}^{n} [f(\xi_k) \pm \varphi(\xi_k)]\,\Delta x_k = \sum_{k=1}^{n} f(\xi_k)\,\Delta x_k \pm \sum_{k=1}^{n} \varphi(\xi_k)\,\Delta x_k$$

as the subintervals of the partitions of the interval $\alpha \le x \le \beta$ tend to zero, we receive property 4 etc. Property 8 is implied by the same definition.

8. If the variables in question have certain dimensions then

$$\Bigl[\int_\alpha^\beta f(x)\,dx\Bigr] = [f]\cdot[x]$$

since the operation of summation and the operation of passing to 
a limit do not change dimensions. 



9. There are certain cases when the integration in symmetric limits can be simplified. Namely, as we see in Fig. 249, we have

$$\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx$$

for every even function $f(x)$, and we have

$$\int_{-a}^{a} f(x)\,dx = 0$$

for every odd function $f(x)$.

10. An integral of a periodic function taken over an interval 
whose length is equal to the period of the function does not depend 
on the position of the interval on the axis of the variable of integration. 
In other words, if $f(x + A) \equiv f(x)$ then the integral

$$I = \int_x^{x+A} f(s)\,ds$$

is independent of $x$. Indeed, according to the rule of differentiating a composite function, and on the basis of the formulas for the derivatives of an integral with respect to its lower and upper limits, we get

$$\frac{dI}{dx} = f(x + A)\,(x + A)'_x - f(x) = f(x + A) - f(x) = 0$$

(Let the reader prove the property by taking advantage of the geometric meaning of the definite integral.)
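
Properties 9 and 10 lend themselves to a quick numerical check. The sketch below is illustrative only; the helper name `midpoint` and the sample even, odd and periodic functions are our assumptions and do not come from the text.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

# Property 9: symmetric limits.
even = lambda x: x * x * math.cos(x)     # an even function
odd = lambda x: x ** 3 * math.cos(x)     # an odd function
a = 1.5
print(midpoint(even, -a, a), 2 * midpoint(even, 0.0, a))  # equal values
print(midpoint(odd, -a, a))                               # close to zero

# Property 10: a function with period A integrated over intervals of length A.
A = 2 * math.pi
periodic = lambda s: math.sin(s) ** 2 + 0.5 * math.cos(3 * s)
values = [midpoint(periodic, x, x + A) for x in (-1.0, 0.0, 2.7)]
print(values)                            # the same number for every starting point x
```

For this particular periodic function the common value is $\pi$, whatever the position of the interval.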

In conclusion, let us consider several examples of incorrect evaluation of definite integrals.

1. $$\int_{-1}^{1} \sqrt{1 - x^2}\,dx = \int_\pi^{2\pi} \sqrt{1 - \cos^2 t}\,(-\sin t)\,dt = -\int_\pi^{2\pi} \sin^2 t\,dt = -\int_\pi^{2\pi} \frac{1 - \cos 2t}{2}\,dt = -\frac{1}{2}\Bigl(t - \frac{\sin 2t}{2}\Bigr)\Big|_\pi^{2\pi} = -\frac{\pi}{2}$$

(we have performed the substitution $x = \cos t$, $\pi \le t \le 2\pi$). The result is apparently incorrect since an integral of a positive function taken in the positive direction (that is from a smaller limit to a larger one) must be positive. The mistake lies in the replacement of $\sqrt{\sin^2 t}$ by $\sin t$ whereas it should have been replaced by $|\sin t|$ (see the end of Sec. I.5). Actually, we have $\sin t < 0$ for $\pi < t < 2\pi$ and hence $\sqrt{\sin^2 t} = -\sin t$ for such $t$. If we took $\sqrt{\sin^2 t} = |\sin t|$ we should obtain the correct result

$$\int_{-1}^{1} \sqrt{1 - x^2}\,dx = \frac{\pi}{2}$$

We sometimes encounter integrals of the form

$$\int_a^b |f(x)|\,dx$$

in analogous situations. These integrals can be treated in the following way. We begin with determining the intervals of retention of the sign of the function $f(x)$ (see Sec. III.15). For instance, let $f(x) > 0$ for $a < x < c$, $f(x) < 0$ for $c < x < d$ and $f(x) > 0$ for $d < x < b$. Then

$$\int_a^b |f(x)|\,dx = \int_a^c |f(x)|\,dx + \int_c^d |f(x)|\,dx + \int_d^b |f(x)|\,dx = \int_a^c f(x)\,dx - \int_c^d f(x)\,dx + \int_d^b f(x)\,dx$$

etc.
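
The scheme can be illustrated on a concrete function. In the sketch below the choice $f(x) = \sin x$ on the interval from $0$ to $3\pi$ is our own assumption, made because its sign pattern (plus, minus, plus) matches the case described above; the names `F`, `by_pieces` and `naive` are hypothetical.

```python
import math

F = lambda x: -math.cos(x)     # an antiderivative of f(x) = sin x

# Points where the sign of sin x changes on [0, 3*pi].
a, c, d, b = 0.0, math.pi, 2 * math.pi, 3 * math.pi

# Integral of |f|: the sign is reversed on the piece where f < 0.
by_pieces = (F(c) - F(a)) - (F(d) - F(c)) + (F(b) - F(d))

# Applying formula (11) to f itself gives the "algebraic" value instead.
naive = F(b) - F(a)

# Independent check with an integral sum for |sin x| (midpoint rule).
n = 30000
h = (b - a) / n
numeric = sum(abs(math.sin(a + (k + 0.5) * h)) for k in range(n)) * h

print(by_pieces, numeric, naive)
```

The first two numbers are close to 6, while the third is 2, since the negative piece cancels one of the positive ones.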

2. The correct value of the integral $I = \int_{-1}^{2} x^2\,dx$ is 3 which is obtained without any substitution:

$$\int_{-1}^{2} x^2\,dx = \frac{x^3}{3}\Big|_{-1}^{2} = \frac{2^3}{3} - \frac{(-1)^3}{3} = 3$$

But the following calculations are incorrect, and their result contradicts the correct value $I = 3$:

$$\int_{-1}^{2} x^2\,dx = \int_1^4 t\,\frac{1}{2\sqrt{t}}\,dt = \frac{1}{2}\int_1^4 \sqrt{t}\,dt = \frac{7}{3}$$

(we have made the change $x^2 = t$, i.e. $x = \sqrt{t}$). The mistake lies in the fact that the formula $x = \sqrt{t}$ of transition from $x$ to $t$ makes no sense for $x < 0$. Therefore, if there are reasons that make it necessary to perform the substitution $x^2 = t$, we must break the integral into two summands according to the formula

$$\int_{-1}^{2} x^2\,dx = \int_{-1}^{0} x^2\,dx + \int_0^2 x^2\,dx$$

and then put $x = -\sqrt{t}$ ($1 \ge t \ge 0$) in the first integral and $x = \sqrt{t}$ ($0 \le t \le 4$) in the second integral (let the reader do it!).

3. As in example 1, the following result is apparently incorrect:

$$\int_{-1}^{2} \frac{dx}{x^2} = \frac{x^{-1}}{-1}\Big|_{-1}^{2} = \frac{2^{-1} - (-1)^{-1}}{-1} = -\frac{3}{2}$$

Indeed, the integrand approaches infinity at x = 0, and we cannot 
therefore apply the Newton-Leibniz formula here. We shall discuss 
integrals of this type in Sec. 16. 
5. Integrating Inequalities. The definition of the integral and its geometric meaning (see Sec. 2) imply that

$$\text{if } f(x) \ge 0 \text{ and } \alpha < \beta \text{ then } \int_\alpha^\beta f(x)\,dx \ge 0 \tag{15}$$

The last inequality turns into the equality if and only if $f(x) \equiv 0$ on the interval $\alpha \le x \le \beta$ in the case of a continuous function $f(x)$. But if we consider discontinuous integrable functions as well then the integral of a function which is different from zero at a finite number of discrete points is nevertheless equal to zero because such points do not affect the value of the integral.

If there is a condition

$$\varphi(x) \le \psi(x) \quad \text{for} \quad \alpha \le x \le \beta \tag{16}$$

then putting $\psi(x) - \varphi(x) = f(x)$ and taking advantage of assertion (15) we deduce

$$\int_\alpha^\beta [\psi(x) - \varphi(x)]\,dx \ge 0 \quad \text{and} \quad \int_\alpha^\beta \psi(x)\,dx - \int_\alpha^\beta \varphi(x)\,dx \ge 0$$

Hence, we have

$$\int_\alpha^\beta \varphi(x)\,dx \le \int_\alpha^\beta \psi(x)\,dx \tag{17}$$

Thus, inequality (16) implies inequality (17) which means that the sign of inequality is retained when we integrate an inequality in the positive direction. (Think about the changes that must be made in the assertion if an inequality is integrated in the negative direction.)

As above, if inequality (16) holds then inequality (17) turns into the equality if and only if $\varphi(x) \equiv \psi(x)$ for $\alpha \le x \le \beta$ in the case of continuous functions $\varphi(x)$ and $\psi(x)$ although in the case of discontinuous integrable functions we can have the equality even if $\varphi(x)$ is different from $\psi(x)$ at separate points.

As a consequence of inequality (17) we obtain a crude estimation of the definite integral: let

$$f_{\min} \le f(x) \le f_{\max} \qquad (\alpha \le x \le \beta)$$

where $f_{\min}$ and $f_{\max}$ are two constants. Then integrating these inequalities we obtain

$$f_{\min}\,(\beta - \alpha) \le \int_\alpha^\beta f(x)\,dx \le f_{\max}\,(\beta - \alpha) \tag{18}$$
In connection with this estimation let us consider the important notion of the mean value of a function which is also called “the arithmetic mean” of the function. If a function $f(x)$ is regarded as being defined over an interval $\alpha \le x \le \beta$ then its mean value on the interval is a constant $\bar{f}$ such that the integral of the constant over the interval $\alpha \le x \le \beta$ is equal to the integral of the function taken from $\alpha$ to $\beta$. Thus, $\int_\alpha^\beta \bar{f}\,dx = \int_\alpha^\beta f(x)\,dx$, i.e.

$$\int_\alpha^\beta f(x)\,dx = \bar{f}\cdot(\beta - \alpha)$$

The last formula (the first mean value theorem) implies the following expression for the mean value:

$$\bar{f} = \frac{1}{\beta - \alpha}\int_\alpha^\beta f(x)\,dx \tag{19}$$

As would be expected, inequality (18) implies that

$$f_{\min} \le \bar{f} \le f_{\max}$$

The geometric meaning of the mean of a function is illustrated in Fig. 250. We see that $\bar{f}$ satisfies the condition that the area of the rectangle $AB'C'D$ is equal to the area of the curvilinear trapezoid $ABCD$. It is clear that if the function $f(x)$ is continuous it takes on the value $\bar{f}$ at a point belonging to the interval $\alpha \le x \le \beta$ (at the point $\gamma$ in Fig. 250)*. A discontinuous function may not assume its mean value.

* Hence, for a continuous function $f(x)$, the first mean value theorem is written in the form $\int_\alpha^\beta f(x)\,dx = f(\gamma)\,(\beta - \alpha)$ where $\gamma$ is a certain point belonging to the interval $\alpha \le x \le \beta$. (Tr.)

The advisability of the above definition of the mean of a function is well seen if we consider an example of a functional relationship between the instantaneous velocity of a non-uniform motion of a point and a current moment of time. The integral of the velocity over a time interval being equal to the distance passed [see the first formula (8)], we see that the mean value of the velocity on a time interval is a constant velocity such that if the point moved
uniformly with this velocity it would cover the same distance as 
in its non-uniform motion in question, during the same time inter- 
val. In other words, formula (19) implies that the mean value of the 
velocity on a finite time interval is equal to the ratio of the distance 
covered to the time taken. Hence, this notion agrees with the well- 
known notion of a mean velocity. The notions of a mean density, 
mean power etc. are also in agreement with the general notion of 
the mean value of a function. 
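
The mean velocity interpretation can be traced on a concrete motion. The following sketch is an illustration under our own assumptions: the velocity law $v(t) = 3t^2$ on the time interval from 0 to 2 and the helper names `midpoint` and `mean_value` are hypothetical choices, not taken from the text.

```python
def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

def mean_value(f, a, b):
    """Mean value of f on [a, b] according to formula (19)."""
    return midpoint(f, a, b) / (b - a)

v = lambda t: 3.0 * t ** 2          # instantaneous velocity (assumed law of motion)
t0, t1 = 0.0, 2.0

distance = midpoint(v, t0, t1)      # distance covered during the interval
v_mean = mean_value(v, t0, t1)      # distance divided by the time taken
gamma = (v_mean / 3.0) ** 0.5       # moment at which v(gamma) equals the mean

print(distance, v_mean, gamma)
print(v(t0) <= v_mean <= v(t1))     # consequence of inequality (18) for increasing v
```

Here the velocity is continuous, so the moment `gamma` at which the instantaneous velocity coincides with the mean velocity lies inside the time interval.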

If a function is defined over an infinite interval, for instance, in the interval $\alpha \le x < \infty$, then its mean value is defined as

$$\bar{f} = \lim_{N \to \infty} \frac{1}{N - \gamma}\int_\gamma^N f(x)\,dx \qquad (\gamma = \text{const},\ \alpha \le \gamma < \infty)$$

that is as the limit of the mean value corresponding to a finite interval in the process when the length of the interval is increased unlimitedly. It is easy to verify that if the limit exists it does not depend on the choice of the value $\gamma$ (which can also be taken as equal to $\alpha$).

Let us consider an alternating current circuit in which the current flow $j$ and the voltage $u$ are expressed by the formulas $j = j_0\cos(\omega t + \alpha)$ and $u = u_0\cos(\omega t + \alpha + \varphi)$ where $\varphi$ is a constant phase shift between the voltage and the current. The mean power of the current in this circuit is equal to

$$\bar{p} = \overline{ju} = \lim_{T \to \infty} \frac{1}{T}\int_0^T j_0\cos(\omega t + \alpha)\,u_0\cos(\omega t + \alpha + \varphi)\,dt = \lim_{T \to \infty} \frac{j_0 u_0}{2T}\int_0^T [\cos(2\omega t + 2\alpha + \varphi) + \cos\varphi]\,dt$$

$$= \lim_{T \to \infty} \frac{j_0 u_0}{2}\Bigl[\frac{\sin(2\omega T + 2\alpha + \varphi) - \sin(2\alpha + \varphi)}{2\omega T} + \cos\varphi\Bigr] = \frac{j_0 u_0}{2}\cos\varphi$$

This formula accounts for the significance of the quantity $\cos\varphi$ in electrical engineering.
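
The passage to the limit can be watched numerically. In the sketch below all numerical values (the amplitudes, the angular frequency, the phases) and the helper names `midpoint`, `power` and `mean_power` are our own illustrative assumptions; the mean of the instantaneous power over the interval from $0$ to $T$ is computed for growing $T$ and compared with the limiting value.

```python
import math

def midpoint(f, a, b, n):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

j0, u0 = 2.0, 10.0               # amplitudes of current and voltage (sample values)
omega = 2 * math.pi              # angular frequency: one oscillation per unit of time
alpha, phi = 0.4, math.pi / 3    # initial phase and phase shift (sample values)

def power(t):
    """Instantaneous power j(t) * u(t)."""
    return (j0 * math.cos(omega * t + alpha)
            * u0 * math.cos(omega * t + alpha + phi))

def mean_power(T, points_per_unit=1000):
    """Mean of the instantaneous power over the time interval [0, T]."""
    n = max(1, int(T * points_per_unit))
    return midpoint(power, 0.0, T, n) / T

limit = 0.5 * j0 * u0 * math.cos(phi)
for T in (0.3, 3.3, 33.3, 200.5):
    print(T, mean_power(T), limit)
```

The deviation from the limiting value decreases roughly in inverse proportion to $T$, in agreement with the term containing $2\omega T$ in the denominator.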

In conclusion, we give one more inequality which is sometimes applied. Since the absolute value of a sum cannot exceed the sum of the absolute values (see the end of Sec. I.5) we can write the inequality

$$\Bigl|\sum_{k=1}^{n} f(\xi_k)\,\Delta x_k\Bigr| \le \sum_{k=1}^{n} |f(\xi_k)\,\Delta x_k| = \sum_{k=1}^{n} |f(\xi_k)|\,\Delta x_k$$

for integral sum (6). Then passing to the limit we get

$$\Bigl|\int_a^b f(x)\,dx\Bigr| \le \int_a^b |f(x)|\,dx \tag{20}$$

In other words, the absolute value of an integral does not exceed the integral of the absolute value of the integrand [think in what cases inequality (20) turns into the equality].


§ 2. Applications of Definite Integral 

6. Two Schemes of Application. There are two basic schemes of 
application of the definite integral to calculating geometrical, phy- 
sical and other quantities. 

The first scheme is based on the definition of the integral which 
introduces it as the limit of an integral sum [see formula (7)]. 
According to this scheme, a quantity in question is approximately 
represented as an integral sum so that the representation should 
become more and still more precise, as the lengths of the subintervals 
of partitions are decreased, and quite exact after the passage to the 
limit. Therefore the quantity turns out to be equal to the limit of 
the integral sums, i.e. to the integral. This method was clearly illu- 
strated by the examples considered in Sec. 1 which led to four inte- 
grals (8). As it was indicated in Sec. 2, this scheme is based on the 
representation of a definite integral in the form of a sum of infinitely 
many infinitesimal summands. 

The second scheme of application of integrals consists in forming 
a relationship between the differentials of quantities in question, 
that is in forming a so-called differential equation. After the rela- 
tionship between the differentials has been deduced we apply for- 
mula (13), which can also be put down in the form 

$$\int dy = y_{\text{terminal}} - y_{\text{initial}}$$

and thus obtain a relationship between the quantities themselves. 
The meaning of the above formula is that the sum of infinitesimal 
increments of a quantity is equal to the total increment of the quan- 
tity. 

Let us consider an example. Suppose a point is moving along the $s$-axis under the action of a variable force which is directed along the axis and assumes the value $F(s)$ at each point $s$. Let the point pass the distance from $s = a$ to $s = b$ and let it be necessary to compute the work $A_{\text{total}}$ of the force along this path. The work $A$ of the force performed in the process of motion is connected by a functional relationship with the distance passed, that is $A = A(s)$. When the point passes a small interval from $s$ to $s + \Delta s$ the force does not change considerably and we can therefore approximately regard it as constant along this small path and, according to the well-known physical formula, we can write

$$\Delta A \approx F(s)\,\Delta s$$
A more precise formula has the form

$$\Delta A = F(s)\,\Delta s + \alpha \tag{21}$$

where $|\alpha| \ll \Delta s$, i.e. $\alpha$ is of higher order of smallness relative to $\Delta s$. The fact that $\alpha$ is indeed an infinitesimal of higher order of smallness is implied by the following consideration: $\alpha$ is caused by the variability of $F$ on the interval $\Delta s$ but the variation of $F$ is infinitesimal when $\Delta s$ is infinitesimal and, besides, this variation is multiplied by $\Delta s$ when $\Delta A$ is calculated.

Now if we recall that a differential is defined as the principal (linear) part of the corresponding increment (see Sec. IV.8) we can write, on the basis of (21), that

$$dA = F(s)\,ds \tag{22}$$

Integrating we obtain

$$A_{\text{total}} = A(b) - A(a) = \int_a^b dA = \int_a^b F(s)\,ds$$

This formula is often written in a simplified form as

$$A = \int F\,ds$$

Although the limits of integration are not put down in the last 
formula the integral is understood as a definite integral having cer- 
tain limits of integration. 

In problem-solving practice the above detailed consideration is 
usually replaced by the following simplified consideration: the 
force can be regarded as constant along the infinitesimal path ds 
and this immediately implies formula (22) for the corresponding 
infinitesimal increment of the work. Then formula (22) is integrated 
etc. This consideration is brief but quite correct, and if we discuss 
it at length we shall arrive at the comprehensive consideration which 
was given previously. We shall turn back to this question in Sec. 
XVI. 4. 
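
The reasoning with the small increments of work can be imitated directly. The sketch below is an illustration under our own assumptions: the force law $F(s) = ks$ (an elastic force with a sample stiffness `k_spring`) and the helper name `work` are hypothetical. On every small path the force is regarded as constant and equal to its value at the beginning of the path, exactly as in the approximate formula for the increment of the work.

```python
def work(F, a, b, n):
    """Sum of elementary works F(s) * ds over n small paths covering [a, b]."""
    ds = (b - a) / n
    return sum(F(a + k * ds) * ds for k in range(n))

k_spring = 200.0                   # stiffness (sample value)
F = lambda s: k_spring * s         # assumed force law
a, b = 0.0, 0.5

exact = k_spring * (b ** 2 - a ** 2) / 2   # integral of F(s) ds by formula (11)
for n in (10, 100, 1000, 10000):
    print(n, work(F, a, b, n), exact)
```

The error of the sum decreases in proportion to the length of the small paths, which reflects the fact that the neglected terms are of higher order of smallness.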

7. Differential Equation with Variables Separable. The general form of a relationship of type (22) can be written as

$$dy = f(x)\,dx \tag{23}$$

where $x$ and $y$ are some variables connected by a functional relationship. Integrating we deduce

$$y_1 - y_0 = \int_{x_0}^{x_1} f(x)\,dx$$

where $y_0 = y(x_0)$ and $y_1 = y(x_1)$.
Equation (23) is the simplest differential equation. Differential equations will be treated in more detail in Chapter XV but some simple examples that can be considered without applying the general theory can be illustrated here. For instance, we often encounter differential equations of the form

$$dy = \varphi(y)\,dx \tag{24}$$

We cannot simply integrate both sides of the equation because in this case the integrand under the sign of integration on the right-hand side would contain an unknown function $y(x)$. We must therefore transpose $\varphi(y)$ to the left-hand side, that is we must write $\frac{dy}{\varphi(y)} = dx$ beforehand. Then the integration yields

$$\int_{y_0}^{y_1} \frac{dy}{\varphi(y)} = x_1 - x_0 \qquad (y_0 = y(x_0),\ y_1 = y(x_1))$$

Similarly, an equation of the form

$$dy = f(x)\,\varphi(y)\,dx \tag{25}$$

is integrated as follows:

$$\frac{dy}{\varphi(y)} = f(x)\,dx, \qquad \int_{y_0}^{y_1} \frac{dy}{\varphi(y)} = \int_{x_0}^{x_1} f(x)\,dx$$

Equations (23)-(25) are called differential equations with variables separable because the terms containing $x$ and $dx$ can be separated from the terms containing $y$ and $dy$ by means of simple algebraic transformations, and after this the integration is carried out immediately.

As an example, let us consider the problem of outflow of a liquid from a cylindric vessel through an opening of area $\sigma$ at the bottom of the vessel (see Fig. 251). Here the height $h$ of the level of the liquid above the bottom depends on the time $t$, i.e. $h = h(t)$. If the liquid is not viscous, and if it is permissible to neglect the forces of surface tension, the exit velocity $v$ with which the liquid flows out of the vessel is described, within a sufficient accuracy, by Torricelli’s law (established by E. Torricelli, 1608-1647, a prominent Italian physicist and mathematician):

$$v = \sqrt{2gh} \tag{26}$$

We can readily form the differential equation of the problem on the basis of this law. Let us involve brief considerations similar to that in the last paragraph of Sec. 6. The exit velocity can be regarded as constant during the time interval $dt$, and therefore, by formula (26), the corresponding outflow is the volume $dV = \sigma v\,dt = \sigma\sqrt{2gh}\,dt$ of the liquid.

On the other hand, the same volume is equal to $dV = S\,|dh| = -S\,dh$, where $S$ is the area of the horizontal cross-section of the vessel. (One should take into account that $h$ decreases here and therefore $dh < 0$.) Equating both expressions of the volume we obtain the equation

$$-S\,dh = \sigma\sqrt{2gh}\,dt \tag{27}$$

belonging to type (24). In order to integrate the equation let us separate the terms depending on $h$ (and on $dh$) from the terms depending on $t$ (and $dt$):

$$-\frac{S\,dh}{\sigma\sqrt{2gh}} = dt$$

Integrating we receive

$$-\int_H^0 \frac{S\,dh}{\sigma\sqrt{2gh}} = \int_0^T dt \qquad (H = h(0))$$

where $T$ is the total time of outflow of the liquid. We finally obtain

$$\frac{S}{\sigma\sqrt{2g}}\,2\sqrt{h}\,\Big|_0^H = T, \quad \text{i.e.} \quad T = \frac{S}{\sigma}\sqrt{\frac{2H}{g}}$$
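
The expression for the total time of outflow can be compared with a direct step-by-step treatment of equation (27). The sketch below is a rough illustration under our own assumptions: the simple Euler stepping scheme, the helper name `outflow_time` and all numerical values (cross-section of the vessel, area of the opening, initial height) are hypothetical choices.

```python
import math

def outflow_time(S, sigma, H, g=9.81, dt=0.01):
    """Step equation (27), -S dh = sigma*sqrt(2 g h) dt, until the vessel is empty."""
    h, t = H, 0.0
    while h > 0.0:
        v = math.sqrt(2.0 * g * h)     # Torricelli's law (26)
        h -= sigma * v * dt / S        # decrease of the level during the time dt
        t += dt
    return t

S, sigma, H = 1.0, 0.001, 1.0          # sample values: m^2, m^2, m
T_steps = outflow_time(S, sigma, H)
T_formula = (S / sigma) * math.sqrt(2.0 * H / 9.81)
print(T_steps, T_formula)
```

With these sample values both numbers are close to 451.5 seconds; the small discrepancy comes from the finite time step.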



8. Computing Areas of Plane Geometric Figures. The application of the definite integral to computing the area of a curvilinear trapezoid was considered in Sec. 2, and the corresponding rule of signs was illustrated in Fig. 245.

If it is necessary to compute the whole area shaded in Fig. 245 in the “arithmetic” sense (i.e. not in the “algebraic” sense) we can utilize the formula

$$S_1 + S_2 + S_3 = \int_a^b |f(x)|\,dx$$

where the last integral can be evaluated according to the scheme described in example 1 of Sec. 4.

Fig. 252

The computation of the areas of figures other than curvilinear trapezoids can also be performed with the help of integrals. For instance, the area of the geometric figure shown in Fig. 252 can be obtained as the difference of the areas of the two curvilinear trapezoids $ACMDBA$ and $ACNDBA$, i.e.

$$S = \int_a^b f_1(x)\,dx - \int_a^b f_2(x)\,dx = \int_a^b [f_1(x) - f_2(x)]\,dx = \int_a^b h(x)\,dx \tag{28}$$

where $h(x)$ is the length of the segment which is formed when the straight line parallel to the $y$-axis and passing through the point $x$ of the $x$-axis crosses the figure.

Formula (28) can also be interpreted as follows. The area of the 
portion of our figure lying to the left of the straight line x = const 
depends on x. If we give x an infinitesimal increment dx (see Fig. 252) 
then the area of the strip shown in Fig. 252 is added to the former 
area. This additional area can be regarded, to within infinitesimals 
of higher order of smallness, as a rectangle [compare this with the
deduction of formulas (21) and (22)]. It follows that $dS = h(x)\,dx$.
Now, integrating, we obtain formula (28) again. 

The contour of a figure, that is the curve which bounds its area, 
is often represented in parametric form. In such cases it is advisable 
to perform a change of variables in the integrals in question and 
choose the parameter as a new variable. 

For example, let us compute the area bounded by the $x$-axis and by an arc of a cycloid (see Sec. II.6) with parametric equations (II.12). We mean here a part of the cycloid which connects two neighbouring points of intersection of the cycloid with the $x$-axis (i.e. an arc lying between two cusps), and the parameter should therefore be taken within the limits $0 \le \psi \le 2\pi$. Hence,

$$S = \int_0^{2\pi R} y\,dx = \int_0^{2\pi} R(1 - \cos\psi)\,d[R(\psi - \sin\psi)] = R^2\int_0^{2\pi} (1 - \cos\psi)^2\,d\psi = 3\pi R^2$$

because

$$\int (1 - \cos\psi)^2\,d\psi = \int (1 - 2\cos\psi + \cos^2\psi)\,d\psi = \psi - 2\sin\psi + \int \frac{1 + \cos 2\psi}{2}\,d\psi = \frac{3}{2}\psi - 2\sin\psi + \frac{1}{4}\sin 2\psi + C$$
Now let us consider the area of a geometric figure bounded by a closed contour $(L)$ represented by its parametric equations $x = x(t)$ and $y = y(t)$. Suppose that the variable point $(x, y)$ [where $x = x(t)$ and $y = y(t)$] describes the contour in the positive direction once when the parameter $t$ varies from $\alpha$ to $\gamma$ (the positive direction is understood as counterclockwise; see Fig. 253). Then

$$S = \int_a^b y_1\,dx - \int_a^b y_2\,dx$$

But the first integral is equal to $\int_\gamma^\beta y\dot{x}\,dt$ because $x$ varies from $a$ to $b$ as $t$ varies from $\gamma$ to $\beta$ (see Fig. 253) and we have $y = y_1$ and $\dot{x}\,dt = dx$ here. The second integral is transformed similarly and thus we obtain

$$S = \int_\gamma^\beta y\dot{x}\,dt - \int_\alpha^\beta y\dot{x}\,dt = -\int_\alpha^\gamma y\dot{x}\,dt \tag{29}$$

Property 10 in Sec. 4 implies that the values $t = \alpha$ and $t = \gamma$ are not necessarily such that the corresponding points $(x, y)$ should coincide with the extreme left point of the contour; the necessary condition is that the contour should be described exactly once as $t$ varies from $\alpha$ to $\gamma$.

Similarly, projecting the contour on the $y$-axis we can deduce the formula

$$S = \int_\alpha^\gamma x\dot{y}\,dt \tag{30}$$

If we add together formulas (29) and (30) we arrive at the formula

$$S = \frac{1}{2}\int_\alpha^\gamma (x\dot{y} - y\dot{x})\,dt \tag{31}$$

If the contour is described in the negative direction, as $t$ increases, we must change the signs in all formulas (29)-(31).
For example, the area of an ellipse having parametric equations (II.26) can be computed on the basis of formula (31):

$$S = \frac{1}{2}\int_0^{2\pi} [a\cos t \cdot b\cos t - b\sin t\,(-a\sin t)]\,dt = \frac{1}{2}\int_0^{2\pi} ab\,dt = \pi ab$$
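
Formula (31) is convenient for numerical work because it needs only the parametric equations and their derivatives. The sketch below is an illustration under our own assumptions (the helper name `contour_area`, the midpoint rule and the sample semi-axes are hypothetical): it evaluates the integral for an ellipse described counterclockwise and then for the same ellipse described clockwise.

```python
import math

def contour_area(x, y, dx, dy, t0, t1, n=20000):
    """Approximate (1/2) * integral of (x*dy/dt - y*dx/dt) dt over [t0, t1]."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        total += x(t) * dy(t) - y(t) * dx(t)
    return 0.5 * total * h

a, b = 3.0, 2.0   # semi-axes (sample values)

# Counterclockwise description: x = a cos t, y = b sin t.
S_pos = contour_area(lambda t: a * math.cos(t), lambda t: b * math.sin(t),
                     lambda t: -a * math.sin(t), lambda t: b * math.cos(t),
                     0.0, 2 * math.pi)

# Clockwise description of the same ellipse: x = a cos t, y = -b sin t.
S_neg = contour_area(lambda t: a * math.cos(t), lambda t: -b * math.sin(t),
                     lambda t: -a * math.sin(t), lambda t: -b * math.cos(t),
                     0.0, 2 * math.pi)

print(S_pos, S_neg, math.pi * a * b)
```

The first value reproduces the area of the ellipse, and the second differs from it only in sign, as the remark on the negative direction requires.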

Let us proceed to compute areas in polar coordinates. Let a curve be represented by its polar equation $\rho = f(\varphi)$ and let it be necessary to compute the area of the curvilinear “sector” $\alpha \le \varphi \le \beta$ (see Fig. 254). If the angle $\varphi$ is increased by $d\varphi$ then the area of the portion of the sector lying below the ray $ON$ also gains an increment, the additional area being regarded as an isosceles triangle with altitude $\rho$ and base $\rho\,d\varphi$ within an accuracy to infinitesimals of higher order of smallness (why is it so?). Hence, $dS = \frac{1}{2}\rho^2\,d\varphi$ and

$$S = \frac{1}{2}\int_\alpha^\beta \rho^2\,d\varphi \tag{32}$$

Fig. 253

Fig. 254
As an example, let us compute the area shaded in Fig. 255. Passing to polar coordinates in the equation of the hyperbola we obtain

$$\rho^2\cos^2\varphi - \rho^2\sin^2\varphi = 1, \quad \text{i.e.} \quad \rho^2 = \frac{1}{\cos^2\varphi - \sin^2\varphi}$$

Consequently, by formula (32), we derive

$$S = \frac{1}{2}\int_0^\varphi \frac{d\varphi}{\cos^2\varphi - \sin^2\varphi} = \frac{1}{4}\ln\frac{1 + \tan\varphi}{1 - \tan\varphi}$$

(verify the calculations!).
The above result implies an interesting consequence. We have

$$\frac{1 + \tan\varphi}{1 - \tan\varphi} = e^{4S}, \quad \text{i.e.} \quad \tan\varphi = \frac{e^{4S} - 1}{e^{4S} + 1} = \frac{e^{2S} - e^{-2S}}{e^{2S} + e^{-2S}} = \tanh 2S$$

(see Sec. I.28), and therefore

$$NM = \rho\sin\varphi = \frac{\sin\varphi}{\sqrt{\cos^2\varphi - \sin^2\varphi}} = \frac{\tan\varphi}{\sqrt{1 - \tan^2\varphi}} = \frac{\tanh 2S}{\sqrt{1 - \tanh^2 2S}} = \frac{\tanh 2S}{\dfrac{1}{\cosh 2S}} = \sinh 2S$$

We similarly find that $ON = \rho\cos\varphi = \cosh 2S$ and $AP = \tan\varphi = \tanh 2S$. The comparison of these results with Fig. 256 where $\varphi = 2S$, $MN = \sin 2S$, $ON = \cos 2S$ and $AP = \tan 2S$ reveals the geometric meaning of the resemblance between trigonometric (circular) functions and hyperbolic functions and accounts for the origin of the terms “hyperbolic” sine, cosine and tangent.
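
These relations between the area of the hyperbolic sector and the hyperbolic functions are easy to test numerically. The sketch below is an illustration under our own assumptions (Simpson's rule, the helper names `simpson` and `sector_area`, and the sample angle 0.5 radian, which is less than $\pi/4$, are hypothetical choices).

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule for the integral of f from a to b (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def sector_area(phi):
    """Formula (32) with rho^2 = 1 / cos(2*theta); requires |phi| < pi/4."""
    return 0.5 * simpson(lambda th: 1.0 / math.cos(2 * th), 0.0, phi)

phi = 0.5                                        # sample polar angle
S = sector_area(phi)
rho = 1.0 / math.sqrt(math.cos(2 * phi))         # polar radius on the hyperbola
closed_form = 0.25 * math.log((1 + math.tan(phi)) / (1 - math.tan(phi)))

print(S, closed_form)
print(rho * math.sin(phi), math.sinh(2 * S))     # NM and sinh 2S
print(rho * math.cos(phi), math.cosh(2 * S))     # ON and cosh 2S
print(math.tan(phi), math.tanh(2 * S))           # AP and tanh 2S
```

Each printed pair consists of two nearly equal numbers, so the doubled area of the sector plays for the hyperbola the same role as the angle plays for the circle.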

9. The Arc Length of a Curve. We have already dealt with the differential of the arc length in our course (see Sec. VII.23). We shall denote it by $dL$:

$$dL = \sqrt{dx^2 + dy^2 + dz^2} = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\,dt$$

We shall agree that $dL > 0$ and therefore take the sign $+$ in front of the radical. It follows that if the values $t = \alpha$ and $t = \beta$ of the parameter correspond to the end-points of the arc then its length is

$$L = \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\,dt$$
For a plane curve the formula of the arc length is simplified:

$$L = \int \sqrt{dx^2 + dy^2} = \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\,dt$$

If a curve is represented by an equation of the form $y = f(x)$ ($a \le x \le b$) then

$$L = \int \sqrt{dx^2 + dy^2} = \int_a^b \sqrt{1 + y'^2}\,dx$$

For instance, the arc length of the part of a cycloid [represented
by equations (II.12)] between its neighbouring spinodes (cusps) is
computed with the help of the above formula:

$$L = \int_\alpha^\beta \sqrt{\dot{x}^2 + \dot{y}^2}\,dt = \int_0^{2\pi} \sqrt{(x'_\psi)^2 + (y'_\psi)^2}\,d\psi$$

In computing the arc length we take into account the symmetry of the arc:

$$L = 2\int_0^\pi R\sqrt{[(\psi - \sin\psi)']^2 + [(1 - \cos\psi)']^2}\,d\psi = 2R\int_0^\pi \sqrt{(1 - \cos\psi)^2 + \sin^2\psi}\,d\psi = 2R\int_0^\pi \sqrt{2 - 2\cos\psi}\,d\psi =$$

$$= 4R\int_0^\pi \sin\frac{\psi}{2}\,d\psi = -8R\cos\frac{\psi}{2}\,\Big|_0^\pi = 8R$$

The result is extremely simple! 
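
The value $8R$ can be cross-checked numerically. The following is a minimal sketch, assuming the cycloid is parametrized as $x = R(\psi - \sin\psi)$, $y = R(1 - \cos\psi)$ (the form used in the computation above); the function name `arc_length`, the midpoint rule and the value of `R` are illustrative choices and not part of the text.

```python
import math

def arc_length(x_dot, y_dot, t0, t1, n=2000):
    """Approximate L = integral of sqrt(x_dot^2 + y_dot^2) dt by the midpoint rule."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        total += math.hypot(x_dot(t), y_dot(t))
    return total * h

R = 1.5  # arbitrary radius of the rolling circle (illustrative value)

# Derivatives of x = R(psi - sin psi) and y = R(1 - cos psi) with respect to psi.
length = arc_length(lambda p: R * (1 - math.cos(p)),
                    lambda p: R * math.sin(p),
                    0.0, 2 * math.pi)
print(length, 8 * R)
```

The two printed numbers should agree to several decimal places for any positive `R`.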

The differential of the arc length in polar coordinates is readily 
obtained from Fig. 257: 

$$dL = \sqrt{(d\rho)^2 + (\rho\,d\varphi)^2} \tag{33}$$

It follows that if the equation of a curve in polar coordinates is given
in the form $\rho = f(\varphi)$ then its arc length corresponding to the interval
$\alpha \le \varphi \le \beta$ of variation of $\varphi$ is equal to

$$L = \int \sqrt{d\rho^2 + \rho^2\,d\varphi^2} = \int_\alpha^\beta \sqrt{\left(\frac{d\rho}{d\varphi}\right)^2 + \rho^2}\,d\varphi$$

We suggest that the reader should check that expression (33) can
also be obtained from the formulas

$$dL = \sqrt{dx^2 + dy^2}, \quad x = \rho\cos\varphi \quad \text{and} \quad y = \rho\sin\varphi$$

As an example, let us consider the arc length of the cardioid
depicted in Fig. 72. Taking advantage of its polar equation put
down in Fig. 72 we find the arc length:

$$L = 2\int_0^\pi \sqrt{4a^2\sin^2\varphi + 4a^2(1 - \cos\varphi)^2}\,d\varphi = 16a$$

(verify the result!). 
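
The requested verification can be sketched as follows. It assumes that the polar equation of the cardioid in Fig. 72 is $\rho = 2a(1 - \cos\varphi)$; this is inferred from the integrand above, since the figure itself is not reproduced here.

```latex
\begin{aligned}
\rho &= 2a(1 - \cos\varphi), \qquad \frac{d\rho}{d\varphi} = 2a\sin\varphi, \\
\left(\frac{d\rho}{d\varphi}\right)^2 + \rho^2
  &= 4a^2\sin^2\varphi + 4a^2(1 - \cos\varphi)^2
   = 8a^2(1 - \cos\varphi)
   = 16a^2\sin^2\frac{\varphi}{2}, \\
L &= 2\int_0^\pi 4a\sin\frac{\varphi}{2}\,d\varphi
   = -16a\cos\frac{\varphi}{2}\,\Big|_0^\pi
   = 16a .
\end{aligned}
```

The square root is taken with the sign + because $\sin\frac{\varphi}{2} \ge 0$ for $0 \le \varphi \le \pi$.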

10. Computing Volumes of Solids. Suppose that we have a solid 
and that we know the areas of its parallel sections by planes per- 
pendicular to an axis (see Fig. 258). Let it be necessary to compute 
the volume of the solid. Let x be the coordinate reckoned along the 



axis and let the area of the section by the plane passing through
the point $x$ be $S = S(x)$. Take the volume of the part of the solid
lying to the left of the plane passing through $x$. If we increase $x$
by $\Delta x = dx$ (for definiteness, we take $\Delta x > 0$) then the plane is
moved to the right, and an additional volume is added to the former
volume. This additional volume is the volume of the “slice” which
can be regarded as a cylinder with a wide base of area $S(x)$ and
small altitude $dx$, to within infinitesimals of higher
order of smallness. It follows that

$$\Delta V = S(x)\,\Delta x + \text{infinitesimals of higher order}$$

$$dV = S(x)\,dx$$

Now if $x$ varies from $a$ to $b$ we obtain

$$V = \int_a^b S(x)\,dx \tag{34}$$

Let us take an example. We shall compute the volume of a solid 
which is bounded by the surface of a right circular cylinder, by the 
half-disc lying in a plane perpendicular to the axis of the cylinder 
and by the part of an oblique plane passing through the diameter 
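of the disc. The solid is depicted in Fig. 259; it has the form of a
“hoof”.

Fig. 259

Fig. 260

Since the triangles $ABC$ and $A'B'C'$ are similar according
to Fig. 259, the area of the section which is shaded in Fig. 259 is
equal to

$$S = \frac{1}{2}RH\,\frac{R^2 - x^2}{R^2} = \frac{R^2 - x^2}{2R}\,H$$

(here $R$ is the radius of the cylinder, $H$ is the altitude of the hoof and $x$ is
the coordinate reckoned along the diameter from the centre of the disc).
Consequently, by formula (34), we have

$$V = 2\int_0^R \frac{R^2 - x^2}{2R}\,H\,dx = \frac{2}{3}R^2H$$

Note that the number $\pi$ does not enter into the answer!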

We now consider the volume of a solid of revolution. Let a curve
having the equation $y = f(x)$ rotate in space about the $x$-axis and
let it describe the boundary surface of the solid of revolution. Then
the area of the section of the solid by the plane perpendicular to
the $x$-axis and passing through the point $x$ is equal to $S = \pi y^2$
where $y = f(x)$ (see Fig. 260). Hence, by formula (34), we have

$$V = \pi\int_a^b y^2\,dx = \pi\int_a^b f^2(x)\,dx \tag{35}$$

For instance, we can regard a sphere of radius $R$ as a solid generated
by the revolution of the semi-circle having the equation
$y = \sqrt{R^2 - x^2}$. The volume of the sphere is therefore equal to

$$V = \pi \cdot 2\int_0^R \left(\sqrt{R^2 - x^2}\right)^2 dx = 2\pi\left(R^2x - \frac{x^3}{3}\right)\Big|_0^R = \frac{4}{3}\pi R^3$$

Compare the above deduction of the formula with the long and 
artificial procedure applied to deducing the formula of the volume 
of a sphere in elementary mathematical courses. 
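
Formula (34) lends itself to a direct numerical check. The sketch below is only an illustration: the helper `volume_by_sections`, the midpoint rule and the values of `R` and `H` are assumptions, and the hoof is assumed to have triangular sections of area $H(R^2 - x^2)/(2R)$ with $x$ measured along the diameter from the centre of the disc.

```python
import math

def volume_by_sections(section_area, a, b, n=1000):
    """Approximate V = integral of S(x) dx over [a, b] (formula (34)), midpoint rule."""
    h = (b - a) / n
    return h * sum(section_area(a + (i + 0.5) * h) for i in range(n))

R, H = 2.0, 3.0  # illustrative radius and altitude

# "Hoof": triangular sections of area S(x) = H (R^2 - x^2) / (2R), -R <= x <= R.
hoof = volume_by_sections(lambda x: H * (R**2 - x**2) / (2 * R), -R, R)

# Sphere: circular sections of area S(x) = pi (R^2 - x^2), -R <= x <= R.
sphere = volume_by_sections(lambda x: math.pi * (R**2 - x**2), -R, R)

print(hoof, 2 * R**2 * H / 3)
print(sphere, 4 * math.pi * R**3 / 3)
```

Each printed pair compares the sliced volume with the closed-form answer obtained in the text.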

11. Computing Area of Surface of Revolution. The formula for 
computing the area of an arbitrary surface will be established in 



Sec. XVI.10. But the computation of the area of a surface of revolution
can be discussed now. Let a curve $y = f(x) > 0$ rotate about
the $x$-axis (see Fig. 261). Let us draw a plane perpendicular to the
$x$-axis and passing through the point $x$ where $x$ is considered to be
variable. Take the area of the portion of the surface of revolution
lying to the left of the plane.

If the plane is translated by the distance $dx$ to the right the area
of the portion is increased by the surface element which is shaded
in Fig. 261. The element has the form of a ring-shaped surface of
width $dL$ with the circumference (length) $2\pi y$ because the radius
of the ring is equal to $y$. Consequently, we have

$$dS = 2\pi y\,dL = 2\pi y\sqrt{1 + y'^2}\,dx$$

$$S = 2\pi\int_a^b y\,dL = 2\pi\int_a^b y\sqrt{1 + y'^2}\,dx \tag{36}$$

As an example, let us compute the area of the portion of a paraboloid
of revolution intercepted by a plane perpendicular to the
axis of revolution (see Fig. 262). Let the radius $R$ of the base and
the altitude $H$ be given. The surface being generated by the revolution
of a parabola whose axis coincides with the $x$-axis, the equation
of the curve has the form $x = ky^2$. The constant $k$ must be
chosen so that the parabola should pass through the point $(H, R)$.
This implies $H = kR^2$, i.e. $k = \frac{H}{R^2}$. Finally, the equation of the
curve is

$$x = \frac{H}{R^2}\,y^2 \quad \text{or} \quad y = \frac{R}{\sqrt{H}}\sqrt{x}$$

Taking advantage of formula (36) we obtain

$$S = 2\pi\int_0^H \frac{R}{\sqrt{H}}\sqrt{x}\,\sqrt{1 + \frac{R^2}{4Hx}}\,dx = \frac{\pi R}{H}\int_0^H \sqrt{4xH + R^2}\,dx = \frac{\pi R}{H}\,\frac{(4xH + R^2)^{3/2}}{\frac{3}{2}\cdot 4H}\,\Big|_0^H = \frac{\pi R}{6H^2}\left[(4H^2 + R^2)^{3/2} - R^3\right]$$
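
As a sanity check on the closed-form area of the paraboloid, formula (36) can be evaluated numerically. This is a sketch under stated assumptions: the helper `surface_of_revolution`, the midpoint rule (which conveniently avoids the point $x = 0$ where $y'$ is infinite) and the values of `R` and `H` are illustrative only.

```python
import math

def surface_of_revolution(y, dy, a, b, n=20000):
    """Approximate S = 2*pi * integral of y*sqrt(1 + y'^2) dx (formula (36)), midpoint rule."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += y(x) * math.sqrt(1.0 + dy(x) ** 2)
    return 2.0 * math.pi * h * total

R, H = 1.0, 2.0  # illustrative base radius and altitude

# Generating parabola y = R*sqrt(x/H) and its derivative y' = R / (2*sqrt(H*x)).
numeric = surface_of_revolution(lambda x: R * math.sqrt(x / H),
                                lambda x: R / (2.0 * math.sqrt(H * x)),
                                0.0, H)
closed_form = math.pi * R / (6 * H**2) * ((4 * H**2 + R**2) ** 1.5 - R**3)
print(numeric, closed_form)
```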




§ 3. Numerical Integration 

12. General Remarks. The basic method of evaluating a definite
integral by means of the corresponding indefinite integral described
in Sec. 3 is sometimes inexpedient or even practically inapplicable.
As was indicated in Sec. XIII.11, there are many indefinite
integrals of elementary functions which cannot be expressed in
terms of elementary functions or whose expressions
are too complicated. Besides, a function which we have to integrate
can be represented in a way which does not yield its analytical
expression. In these cases one can use a number of methods which
we are going to review here.

1. Some integrals are expressible in terms of certain thoroughly
studied and tabulated non-elementary “special” functions.

For instance, one of these functions is the error function

$$\operatorname{Erf} x = \int_0^x e^{-t^2}\,dt \quad (-\infty < x < \infty) \tag{36'}$$

Further examples are the Fresnel integrals (named after A. Fresnel,
1788-1827, a prominent French physicist, the creator of the wave
theory of light),

$$C(x) = \int_0^x \cos\frac{\pi t^2}{2}\,dt \quad \text{and} \quad S(x) = \int_0^x \sin\frac{\pi t^2}{2}\,dt \quad (-\infty < x < \infty)$$

the exponential integral

$$\operatorname{Ei} x = \int_{-\infty}^x \frac{e^t}{t}\,dt \quad (-\infty < x < 0)$$

the sine integral

$$\operatorname{Si} x = \int_0^x \frac{\sin t}{t}\,dt \quad (-\infty < x < \infty)$$

the cosine integral

$$\operatorname{Ci} x = \int_\infty^x \frac{\cos t}{t}\,dt \quad (0 < x < \infty)$$

and many other functions. Integrals with infinite limits will be
considered in full in § 4.

Let us take an example. In order to evaluate the integral

$$I = \int_0^1 \frac{\sin^2 x}{x^2}\,dx$$

we integrate it by parts putting $u = \sin^2 x$ and $dv = \frac{dx}{x^2}$:

$$I = -\frac{\sin^2 x}{x}\,\Big|_0^1 + \int_0^1 \frac{2\sin x\cos x}{x}\,dx = -\sin^2 1 + \int_0^1 \frac{\sin 2x}{x}\,dx$$

Now performing the change of variable $2x = t$ we get

$$I = -\sin^2 1 + \int_0^2 \frac{\sin t}{t}\,dt = -\sin^2 1 + \operatorname{Si} 2 = 0.8973$$

The value of the sine integral is taken from [23]. A great number 
of special functions are described in this book. 
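
The value 0.8973 can be reproduced without tables. The sketch below is an illustration only: it computes $\operatorname{Si} 2$ from the power series of the sine integral (an assumed substitute for the tabulated value taken from [23]) and cross-checks the result by applying a composite Simpson rule directly to the original integrand. The function names are hypothetical.

```python
import math

def sine_integral(x, terms=20):
    """Si(x) from its power series: sum of (-1)^k x^(2k+1) / ((2k+1) (2k+1)!)."""
    return sum((-1) ** k * x ** (2 * k + 1) / ((2 * k + 1) * math.factorial(2 * k + 1))
               for k in range(terms))

def integrand(x):
    # sin^2(x) / x^2, with the limiting value 1 at x = 0
    return 1.0 if x == 0.0 else (math.sin(x) / x) ** 2

def simpson(f, a, b, m=200):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, m, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, m, 2))
    return s * h / 3

via_special_function = sine_integral(2.0) - math.sin(1.0) ** 2
via_quadrature = simpson(integrand, 0.0, 1.0)
print(round(via_special_function, 4), round(via_quadrature, 4))
```

Agreement of the two numbers confirms both the integration by parts and the change of variable.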

2. It is sometimes possible to find the exact value of a definite
integral with certain specific limits without calculating the corresponding
indefinite integral. Calculations of this type are usually
difficult but nevertheless we shall give some examples later [for
instance, see formula (72)]. Many integrals of this kind are collected
in [19].
For example, in this book we can find the formulas

$$\int_0^{\pi/2} \tan^p x\,dx = \frac{\pi}{2}\left(\cos\frac{p\pi}{2}\right)^{-1} \quad (-1 < p < 1),$$

$$\int_0^\pi \ln\sin x\,dx = -\pi\ln 2 \quad \text{etc.}$$

but the corresponding indefinite integrals are not elementary functions.

3. Expansions of the integrand into series of different types are
also often used for integration. This method will be described at
length in Chapter XVII but the simple power series which were
mentioned in Sec. IV.16 can be readily applied now.

For instance, taking series (IV.55) for the function $e^x$ we obtain

$$\int_0^1 \frac{e^x - 1}{x}\,dx = \int_0^1 \left(\frac{1}{1!} + \frac{x}{2!} + \frac{x^2}{3!} + \dots\right) dx = \frac{1}{1\cdot 1!} + \frac{1}{2\cdot 2!} + \frac{1}{3\cdot 3!} + \dots = 1.318$$

(the result is accurate to 0.001).

It was noted in Sec. IV.16 that in practice such series can be
treated as finite sums in which the number of terms is taken depending
on the accuracy chosen.
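
The remark about treating the series as a finite sum can be illustrated as follows. This is a minimal sketch, assuming the series in question is $\sum_{k \ge 1} \frac{1}{k \cdot k!}$ and using the simple (assumed) stopping rule "stop when the next term falls below the tolerance", which is adequate here only because the terms decrease very rapidly.

```python
import math

def series_integral(tolerance=1e-3):
    """Sum 1/(k * k!) for k = 1, 2, ... until the next term drops below the tolerance."""
    total, k = 0.0, 1
    while True:
        term = 1.0 / (k * math.factorial(k))
        if term < tolerance:
            return total, k - 1
        total += term
        k += 1

value, terms_used = series_integral()
print(round(value, 3), terms_used)
```

With the default tolerance only a handful of terms are needed to reach the accuracy quoted in the text.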

4. Graphical integration is used when a function is represented
by its graph. The method is based on the geometric meaning of the
definite integral (see Sec. 2) which implies that the integral is equal
to the area of the corresponding curvilinear trapezoid. We can compute
the area either by constructing the graph on graph paper
and counting the number of squares lying inside the bounding line
or by using a special instrument, the so-called planimeter. After
the tracer of the planimeter has been passed round the periphery
of the area of an arbitrary form which is to be measured we read
the area on the meter of the planimeter. Since a planimeter is a
simple mechanism, integration with its help is called mechanical
integration.

5. The most comprehensive methods applicable to integrals of 
arbitrary functions are the methods of numerical integration which 
are reviewed in Sec. 13. These methods can be used for functions 
represented in any possible way, especially for functions represented 
by means of tables. 

13. Formulas of Numerical Integration. These formulas make it
possible to evaluate approximate values of a definite integral if
the values of the integrand at certain points (so-called nodes) of the
interval of integration are given.

Let us begin with the most elementary formula. Let it be necessary
to evaluate the integral

$$\int_a^b y\,dx, \quad y = f(x) \tag{37}$$

Suppose that the interval of integration $a \le x \le b$ is divided into
a finite number $n$ of equal parts of length $h = \frac{b - a}{n}$ and that the values
of the integrand at the points of division are given or calculated.

Introduce the notation

$$f(a) = y_0, \quad f(a + h) = y_1, \quad f(a + 2h) = y_2, \quad \dots, \quad f(a + nh) = f(b) = y_n$$

If we draw the ordinates at each of the nodes the curvilinear trapezoid
whose area is equal to integral (37) is broken into $n$ parts
(see Fig. 263). Each of these parts is also a curvilinear trapezoid. Now
let us replace the parts by rectilinear trapezoids whose bases are
pairs of neighbouring ordinates (see Fig. 263). The areas of these
trapezoids are equal to

$$\frac{y_0 + y_1}{2}\,h, \quad \frac{y_1 + y_2}{2}\,h, \quad \dots, \quad \frac{y_{n-1} + y_n}{2}\,h$$

Adding the areas together we obtain the area of a polygonal figure
inscribed into the original curvilinear trapezoid. If $n$ is sufficiently
large, that is if $h$ is sufficiently small, the area of the polygon will
be approximately equal to the area of the curvilinear trapezoid,
i.e. to the integral. Thus, we obtain

$$\int_a^b y\,dx \approx \frac{y_0 + y_1}{2}\,h + \frac{y_1 + y_2}{2}\,h + \dots + \frac{y_{n-1} + y_n}{2}\,h$$

or

$$\int_a^b y\,dx \approx h\left(\frac{y_0 + y_n}{2} + y_1 + y_2 + \dots + y_{n-1}\right) \tag{38}$$

This is the so-called trapezoid formula (trapezoid rule).
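
Formula (38) translates almost literally into code. The sketch below is illustrative: the function name `trapezoid` and the sample data are assumptions, and the function works on a list of equally spaced ordinates so that it applies equally to tabulated functions.

```python
def trapezoid(ys, h):
    """Formula (38): h * ((y_0 + y_n)/2 + y_1 + ... + y_{n-1}) for equally spaced ordinates."""
    if len(ys) < 2:
        raise ValueError("at least two ordinates are required")
    return h * ((ys[0] + ys[-1]) / 2 + sum(ys[1:-1]))

# Usage on a tabulated function: y = x^2 sampled on [0, 1] with n = 4, h = 0.25.
h = 0.25
ordinates = [(i * h) ** 2 for i in range(5)]   # 0, 0.0625, 0.25, 0.5625, 1
approx = trapezoid(ordinates, h)
print(approx)  # the exact integral is 1/3
```

For this convex integrand the trapezoids lie above the curve, so the approximation slightly exceeds the exact value.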

We can give an interpretation of the trapezoid formula which
is independent of its geometric meaning. Virtually, before integrating
the function we replaced it by linear functions which
take on the values of the integrand $y = f(x)$ assumed at the end-points
of each interval of the form $a \le x \le a + h$, $a + h \le x \le a + 2h$
etc. Hence, we can say that we have performed linear
interpolation (see Sec. I.22) before evaluating the integral. Now
let us recall that we gave interpolation formulas in § V.2 which
approximate functions with a greater accuracy than formulas of
linear interpolation. Therefore formulas of numerical integration
based on these interpolation formulas are much more accurate than
formula (38).

If we use interpolation polynomials of the second degree we shall
arrive at Simpson’s formula named after the English mathematician
T. Simpson (1710-1761) who deduced the formula. Let us first suppose
that we know the values of the integrand at three points of the form
$x_0$, $x_0 + h$ and $x_0 + 2h$, that is we are given

$$y(x_0) = y_0, \quad y(x_0 + h) = y_1 \quad \text{and} \quad y(x_0 + 2h) = y_2$$

Then we can put down the interpolation polynomial of the second
degree which assumes the same values at these points. By Newton’s
formula (see Sec. V.8), the polynomial is

$$P(x) = y_0 + \Delta y_0\,\frac{s}{h} + \frac{\Delta^2 y_0}{2}\,\frac{s}{h}\left(\frac{s}{h} - 1\right) \quad (s = x - x_0)$$

Now performing the integration over the subinterval $x_0 \le x \le x_0 + 2h$
we get, by means of the substitution $x - x_0 = s$, $dx = ds$,
$0 \le s \le 2h$, the expression

$$\int_{x_0}^{x_0 + 2h} P(x)\,dx = \int_0^{2h} \left[y_0 + \Delta y_0\,\frac{s}{h} + \frac{\Delta^2 y_0}{2}\,\frac{s}{h}\left(\frac{s}{h} - 1\right)\right] ds = y_0 \cdot 2h + \Delta y_0 \cdot 2h + \frac{\Delta^2 y_0}{2} \cdot \frac{2}{3}h$$

Further, if we substitute

$$\Delta y_0 = y_1 - y_0,$$

$$\Delta^2 y_0 = \Delta y_1 - \Delta y_0 = (y_2 - y_1) - (y_1 - y_0) = y_2 - 2y_1 + y_0$$

then after collecting similar terms we obtain

$$\int_{x_0}^{x_0 + 2h} f(x)\,dx \approx 2h\left[y_0 + (y_1 - y_0) + \frac{1}{6}(y_2 - 2y_1 + y_0)\right] = h\,\frac{y_0 + 4y_1 + y_2}{3} \tag{39}$$

Now suppose that the interval of integration $a \le x \le b$ is divided
into $2n$ equal subintervals with the help of the points of division

$$x_0 = a, \quad x_1 = a + h, \quad x_2 = a + 2h, \quad \dots, \quad x_{2n} = a + 2nh = b$$

where $h = \frac{b - a}{2n}$. Then we can apply formula (39) to each pair
of subintervals of the form $x_0 + (2k - 2)h \le x \le x_0 + (2k - 1)h$
and $x_0 + (2k - 1)h \le x \le x_0 + 2kh$ ($k = 1, 2, \dots, n$):

$$\int_{x_0}^{x_0 + 2h} y\,dx \approx h\,\frac{y_0 + 4y_1 + y_2}{3}$$

$$\int_{x_0 + 2h}^{x_0 + 4h} y\,dx \approx h\,\frac{y_2 + 4y_3 + y_4}{3}$$

$$\dots$$

$$\int_{x_0 + (2n - 2)h}^{x_0 + 2nh} y\,dx \approx h\,\frac{y_{2n-2} + 4y_{2n-1} + y_{2n}}{3}$$

Adding together these formulas and collecting similar terms we
obtain Simpson’s formula:

$$\int_a^b y\,dx \approx \frac{h}{3}\left[(y_0 + y_{2n}) + 2(y_2 + y_4 + \dots + y_{2n-2}) + 4(y_1 + y_3 + \dots + y_{2n-1})\right] \tag{40}$$
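
Formula (40) can be coded in the same spirit as the trapezoid formula. The following is a minimal sketch; the function name `simpson` and the test data are illustrative assumptions. The ordinates are split into end values, interior even-numbered values and odd-numbered values exactly as in the brackets of (40).

```python
def simpson(ys, h):
    """Formula (40) for an odd number of equally spaced ordinates y_0, ..., y_2n."""
    if len(ys) < 3 or len(ys) % 2 == 0:
        raise ValueError("an odd number (at least 3) of ordinates is required")
    ends = ys[0] + ys[-1]
    even = sum(ys[2:-1:2])   # y_2 + y_4 + ... + y_{2n-2}
    odd = sum(ys[1::2])      # y_1 + y_3 + ... + y_{2n-1}
    return h / 3 * (ends + 2 * even + 4 * odd)

# Usage: y = x^3 on [0, 2] with 2n = 4 subintervals (h = 0.5); the exact integral is 4.
h = 0.5
ordinates = [(i * h) ** 3 for i in range(5)]
approx = simpson(ordinates, h)
print(approx)
```

The cubic test function is chosen deliberately: Simpson’s formula integrates polynomials up to the third degree exactly, so the result agrees with the exact value up to rounding.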


Now we proceed to estimate the accuracy of formulas (38) and
(40). Newton’s formula (V.27) implies that in performing linear
interpolation on a subinterval we get an error of the order of $\Delta^2 y$,
i.e. of the order of $h^2$ (see Sec. V.7). According to formula (18), we
can estimate the corresponding absolute error of the integral taken
over the subinterval if we multiply the error of the interpolation
by the length $h$ of the subinterval of integration. Hence, the error
of the integral on a subinterval is of the order of $h^3$. Formula (38)
is obtained by adding together $n$ approximate formulas having
errors of the order of $h^3$. Therefore, the number of subintervals being
equal to $n = \frac{b - a}{h}$, the resultant error is of the order of

$$n \cdot h^3 = \frac{b - a}{h}\,h^3 = (b - a) \cdot h^2$$

that is of the order of $h^2$. For instance, if we double the number of
the division points the accuracy of formula (38)
will increase approximately four times.

One can think that analogous considerations applied to formula
(40) must indicate that its error is of the order of $h^3$. But this is
wrong because in reality the accuracy is still higher. Actually, when
dropping the term containing $\Delta^3 y_0$ which enters into Newton’s
formula we make an error of the order of $h^3$. But it turns out that
the integral of the term is equal to zero because

$$\int_{x_0}^{x_0 + 2h} \frac{s}{h}\left(\frac{s}{h} - 1\right)\left(\frac{s}{h} - 2\right) dx = \int_0^{2h} \left[\left(\frac{s}{h}\right)^3 - 3\left(\frac{s}{h}\right)^2 + 2\,\frac{s}{h}\right] ds = \left(\frac{s^4}{4h^3} - \frac{3s^3}{3h^2} + \frac{2s^2}{2h}\right)\Big|_0^{2h} = 0$$

Hence, the error which occurs after integration is determined by
the subsequent term of Newton’s formula. This subsequent term
is of the order of $h^4$. Consequently, the error of formula (39) is of
the order of $h^5$ and the error of final formula (40) is of the order
of $h^4$. For example, if the number of the points of division is
doubled the accuracy of formula (40) increases 16 times. At the
same time the application of formula (40) is not much more complicated
than that of formula (38).
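
The orders $h^2$ and $h^4$ can be observed experimentally by halving the step and comparing errors. The sketch below is an assumption-laden illustration: the test integral $\int_0^1 e^x\,dx = e - 1$ and the step counts 8 and 16 are arbitrary choices, and both rules are written for a function rather than a table.

```python
import math

def trapezoid(f, a, b, n):
    h = (b - a) / n
    return h * ((f(a) + f(b)) / 2 + sum(f(a + i * h) for i in range(1, n)))

def simpson(f, a, b, n):
    # n must be even: the n subintervals are grouped into n/2 pairs
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return h / 3 * (f(a) + f(b) + 4 * odd + 2 * even)

exact = math.e - 1.0  # integral of e^x over [0, 1]

def error(rule, n):
    return abs(rule(math.exp, 0.0, 1.0, n) - exact)

trapezoid_ratio = error(trapezoid, 8) / error(trapezoid, 16)
simpson_ratio = error(simpson, 8) / error(simpson, 16)
print(trapezoid_ratio, simpson_ratio)
```

The first ratio should come out close to 4 and the second close to 16, in line with the estimates above.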

Let us take an example. The exact value of the integral
$I = \int_0^1 \frac{dx}{1 + x^2}$ is readily found:

$$I = \int_0^1 \frac{dx}{1 + x^2} = \arctan x\,\Big|_0^1 = \frac{\pi}{4} = 0.785$$

If we did not know the answer we could evaluate the integral approximately
by means of formula (38) or (40). For simplicity’s sake,
let us take $n = 2$, that is

$$h = 0.5, \quad x_0 = 0, \quad x_1 = 0.5, \quad x_2 = 1, \quad y_0 = 1.000, \quad y_1 = 0.800, \quad y_2 = 0.500$$

Applying formula (38) we get the value

$$I \approx 0.5\left(\frac{1.000 + 0.500}{2} + 0.800\right) = 0.775$$

whose error is $\approx 1.3$ per cent. The calculations according to formula
(40) yield the value

$$I \approx \frac{0.5}{3}\,(1.000 + 0.500 + 4 \times 0.800) = 0.783$$

the error being about 0.3 per cent. If we had taken $n = 10$ the error
of formula (38) would have been about 0.05 per cent and the error
of formula (40) about $10^{-6}$ per cent.
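
The arithmetic of this example is easy to replay. The short sketch below only repeats the computation with the three ordinates given in the text; the variable names are illustrative.

```python
import math

h = 0.5
y0, y1, y2 = (1.0 / (1.0 + x * x) for x in (0.0, 0.5, 1.0))  # 1.000, 0.800, 0.500

exact = math.pi / 4
by_trapezoid = h * ((y0 + y2) / 2 + y1)        # formula (38)
by_simpson = h / 3 * (y0 + y2 + 4 * y1)        # formula (40)

for name, value in (("trapezoid", by_trapezoid), ("Simpson", by_simpson)):
    relative_error = abs(value - exact) / exact * 100
    print(f"{name}: {value:.3f}, error {relative_error:.1f} per cent")
```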

In books devoted to numerical methods in mathematics one can
find formulas of approximate integration that are more accurate
than Simpson’s formula. We sometimes use formulas constructed
for nodes which are not equally spaced.


§ 4. Improper Integrals 

Up till now we have considered definite integrals with finite
intervals of integration and with integrands which do not approach
infinity on the intervals. We shall call such integrals proper. If
at least one of the above conditions is not fulfilled the integral is
called improper. A proper integral of a continuous function (and
of a function of some wider class) always has a certain numerical
value. In contrast to it, improper integrals which we are going
to study here may not have such a value.

14. Integrals with Infinite Limits of Integration. First let us take
an integral of the form

$$I = \int_a^\infty f(x)\,dx \tag{41}$$

where the lower limit $a$ and the integrand $f(x)$ (considered on
$a \le x < \infty$) are supposed to be finite. This integral is improper because
its upper limit is infinitely large.

To define integral (41) in an exact sense we use the same approach
as that applied in Sec. III.6 to the sum of an infinite series. Namely,
we first “truncate” the integral, i.e. we cut off an infinite portion
of its interval of integration and consider the integral

$$\int_a^N f(x)\,dx \tag{42}$$

where $N$ is a large but finite number. Integral (42) is proper and
possesses a certain numerical value. Then we make $N$ tend to infinity
because in integral (41) we have the sign of infinity as the upper
limit of integration. Integral (42) varies in a certain manner as
$N \to \infty$. If, in this process, it has a certain finite limit we say that
integral (41) is convergent. In this case we put, by definition,

$$\int_a^\infty f(x)\,dx = \lim_{N \to \infty} \int_a^N f(x)\,dx \tag{43}$$

If there is no finite limit integral (41) is said to be divergent.
In such a case we shall not define a numerical value of the integral
in our course (although even in this case it is sometimes possible
to speak about the value of the integral). Hence, we shall speak
about the numerical value of an improper integral of type (41)
only if it converges.

Let us note a particular case of divergence: if integral (42) has
an infinite limit as $N \to \infty$ then integral (41) is said to diverge
to infinity; in this case formula (43) makes sense and can be used.

Let us consider several examples. Suppose a point $T$ is moving
under the action of a force which is directed along the straight
line connecting $T$ with a fixed point $O$ (from $T$ to $O$) and whose
absolute value is inversely proportional to the square of the distance
from $O$ to $T$. In particular, gravitational force and the force of attraction
between two charges of electricity are of this kind. Suppose
it is necessary to compute the work which should be expended to
remove the point $T$ from a position $T_0$ to infinity. This work is
called the potential of the force.

To perform the computation let us write the expression of the
force:

$$F = \frac{k}{s^2} \quad (s = OT)$$

where $k$ is a proportionality factor. Then, by Sec. 6, the work is

$$A = \int_{s_0}^\infty \frac{k}{s^2}\,ds \quad (s_0 = OT_0) \tag{44}$$

This is an improper integral which must be calculated by formula
(43):

$$A = \lim_{N \to \infty} \int_{s_0}^N \frac{k}{s^2}\,ds = \lim_{N \to \infty} \left(-\frac{k}{s}\right)\Big|_{s = s_0}^{N} = \lim_{N \to \infty} \left(\frac{k}{s_0} - \frac{k}{N}\right) = \frac{k}{s_0}$$

Thus, integral (44) converges. We see that the potential is inversely
proportional to the first power of the distance from $O$ to $T_0$.
At first glance one can find it strange that the work corresponding
to the motion along an infinite path turns out to be finite although,
theoretically, the force never stops acting on the point. The explanation
is that the force decreases so fast as the point moves towards
infinity that the expended work tends to a finite quantity but not
to infinity although it increases all the time. The geometric meaning
of the result is illustrated in Fig. 264: despite the fact that the shaded
geometric figure extends to infinity its altitude (that is its ordinate
$F = \frac{k}{s^2}$) decreases so fast that its area turns out to be finite.

Of course, in reality $s$ varies within a finite range extending not
to infinity but to a finite value $S$ (which is very large) because all
physical quantities are finite. Hence, in reality we have an integral
of the form

$$\int_{s_0}^S \frac{k}{s^2}\,ds \tag{45}$$

But beginning with some sufficiently large value of $S$ this integral
practically does not vary and therefore it can be replaced by the
“limiting” integral (44) because it is easier to investigate (44) in
theoretical studies as the value $S$ is not known exactly. The significance
of the convergence of integral (44) is that it provides for
the possibility of replacing “real” integral (45) by integral (44) for
large $S$. From the physical point of view this means that the action
of $O$ upon $T$ can be neglected when the distance from the point $O$
becomes sufficiently large. In performing such a replacement we
do not need the exact value of $S$; the only important thing here
is that we must be sure of $S$ being sufficiently large.

As a second example, let us consider the improper integral

$$\int_1^\infty \frac{1}{x}\,dx \tag{46}$$

Since

$$\lim_{N \to \infty} \int_1^N \frac{1}{x}\,dx = \lim_{N \to \infty} \ln x\,\Big|_1^N = \lim_{N \to \infty} \ln N = \ln\infty = \infty$$

we see that integral (46) diverges to infinity. Thus, we can write

$$\int_1^\infty \frac{1}{x}\,dx = \infty$$

Finally, consider the improper integral

$$\int_0^\infty \sin x\,dx \tag{47}$$

In this case we have neither a finite nor an infinite value of the
limit

$$\lim_{N \to \infty} \int_0^N \sin x\,dx = \lim_{N \to \infty} (-\cos x)\,\Big|_0^N = \lim_{N \to \infty} (1 - \cos N)$$

because it does not exist since the values of $\cos N$, as $N \to \infty$,
“oscillate” within the limits from $-1$ to $+1$ all the time. Consequently,
integral (47) does not diverge to infinity but it diverges
in an oscillating way, its values constantly varying from 0 to 2
and back from 2 to 0. Hence, the integral has neither a finite nor
an infinite value.
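
The three kinds of behaviour of the truncated integral (42) can be watched numerically. The sketch below is illustrative only: the helper `truncated_integral`, the midpoint rule, the step density and the constants `k` and `s0` are assumptions chosen for the demonstration.

```python
import math

def truncated_integral(f, a, big_n, steps_per_unit=200):
    """Integral (42): midpoint-rule value of the proper integral of f over [a, N]."""
    n = max(1, int((big_n - a) * steps_per_unit))
    h = (big_n - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

k, s0 = 1.0, 2.0  # illustrative constants for integral (44)
for big_n in (10.0, 100.0, 1000.0):
    work = truncated_integral(lambda s: k / s**2, s0, big_n)      # tends to k/s0 = 0.5
    log_case = truncated_integral(lambda x: 1.0 / x, 1.0, big_n)  # grows like ln N
    sine_case = truncated_integral(math.sin, 0.0, big_n)          # equals 1 - cos N
    print(big_n, round(work, 4), round(log_case, 4), round(sine_case, 4))
```

The first column of results settles down, the second keeps growing without bound, and the third wanders between 0 and 2, matching integrals (44), (46) and (47) respectively.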

15. Basic Properties of Integrals with Infinite Limits of Integration.
Many properties of proper integrals are automatically extended to
improper integrals of form (41).

First of all, if an antiderivative $F(x)$ of the function $f(x)$ is known
in the interval $a \le x < \infty$ then

$$\int_a^\infty f(x)\,dx = \lim_{N \to \infty} \int_a^N f(x)\,dx = \lim_{N \to \infty} [F(N) - F(a)] = F(\infty) - F(a)$$

because $F(\infty)$ is nothing but the notation for $\lim_{N \to \infty} F(N)$. Hence,
in this case improper integral (41) can be evaluated by means of
formula (11) deduced for proper integrals. The expression $F(\infty)$
itself indicates whether the integral diverges or converges.

For instance, in examples (44), (46) and (47) we could have calculated
in the following way:

$$\int_{s_0}^\infty \frac{k}{s^2}\,ds = -\frac{k}{s}\,\Big|_{s = s_0}^{s = \infty} = \frac{k}{s_0} - \frac{k}{\infty} = \frac{k}{s_0},$$

$$\int_1^\infty \frac{dx}{x} = \ln x\,\Big|_1^\infty = \ln\infty - \ln 1 = \infty$$

and

$$\int_0^\infty \sin x\,dx = -\cos x\,\Big|_0^\infty = -\cos\infty + 1$$

The last result shows in fact that integral (47) does not exist since
the expression $\cos\infty$ makes no sense.

All the basic properties enumerated in Sec. 4 also remain true
for improper integrals with natural exceptions involving the case
of a divergent integral. For instance, formulas (18) and (19) no
longer hold because an integral of a nonzero constant taken over
an infinite interval always diverges. We also note the following
simple property: if integral (41) converges we have

$$\int_N^\infty f(x)\,dx = \int_a^\infty f(x)\,dx - \int_a^N f(x)\,dx \to 0 \quad (N \to \infty)$$


If it is difficult to compute the corresponding indefinite integral
we usually begin the investigation of the improper integral with
testing its convergence or divergence on the basis of special tests
which we are going to study now.

First of all, it should be noted that the convergence or divergence
of integral (41) is determined only by the behaviour of the function
$f(x)$ as $x$ approaches infinity, that is for sufficiently large $x$. In
other words, the integrals $\int_a^\infty f(x)\,dx$ and $\int_b^\infty f(x)\,dx$ converge or
diverge simultaneously provided $f(x)$ does not approach infinity
at a point lying between $a$ and $b$. In fact, the difference between the
integrals is a proper integral having a certain finite numerical value
and therefore it cannot violate the convergence in case one of the
integrals converges and it cannot provide
for the convergence if one of the integrals diverges.

We first consider an integral of a non-negative function:

$$\int_a^\infty f(x)\,dx, \quad f(x) \ge 0 \tag{48}$$

Such an integral either converges or diverges to infinity since the integral
taken from $a$ to $N$ is a non-decreasing quantity here as $N \to \infty$,
and such a quantity either has a finite limit or tends to infinity
(see Sec. III.5). In this case the convergence or divergence means,
geometrically, that the area of the infinite plane figure shaded
in Fig. 265 is, respectively, finite or infinite.

The fact that integral (48) converges or diverges in this case
can be indicated, respectively, by the formula

$$\int_a^\infty f(x)\,dx < \infty$$

or

$$\int_a^\infty f(x)\,dx = \infty$$

Of course, these formulas cannot be applied to divergent integrals
of “oscillating” type (47).

The simplest test for convergence is the comparison test: if

$$0 \le g(x) \le f(x) \quad (a \le x < \infty) \tag{49}$$

and $\int_a^\infty f(x)\,dx < \infty$, that is this integral converges, then we also
have $\int_a^\infty g(x)\,dx < \infty$, that is the last integral converges too. The
proof of the test is implied by integration of inequality (49) or by
the geometric meaning of convergence (see Fig. 265). The same test
suggests that if conditions (49) hold and the integral of $g(x)$ diverges
(equals infinity) then the integral of $f(x)$ also diverges.

We also use the following test: if

$$\frac{f(x)}{g(x)} \xrightarrow[x \to \infty]{} k \quad (k \ne 0,\ k \ne \infty) \tag{50}$$

then the integrals

$$\int_a^\infty f(x)\,dx \quad \text{and} \quad \int_b^\infty g(x)\,dx$$

converge or diverge simultaneously (although in the case of convergence
their numerical values can considerably differ even if $k = 1$
and $a = b$). Indeed, condition (50) implies that neither of the functions
$f(x)$ and $g(x)$ can be considerably larger than the other as
$x \to \infty$, i.e. $f(x) \sim kg(x)$ where $\sim$ is the sign of equivalence (see
Sec. III.7). Therefore, if the shaded figure of the type shown in
Fig. 265 has a finite area for one of the functions the same must be
true for the other.

Most often we compare a given integral of form (48) with the
integral of a power function of the form

$$I = \int_1^\infty \frac{1}{x^p}\,dx \tag{51}$$

which can be easily investigated in a direct manner. If $p > 1$ we have

$$I = \int_1^\infty x^{-p}\,dx = \frac{x^{-p+1}}{-p+1}\,\Big|_1^\infty = \frac{1}{(-p+1)\,x^{p-1}}\,\Big|_1^\infty = 0 - \frac{1}{-p+1} = \frac{1}{p-1} < \infty$$

whereas, for $p < 1$, we have

$$I = \frac{x^{-p+1}}{-p+1}\,\Big|_1^\infty = \frac{x^{1-p}}{1-p}\,\Big|_1^\infty = \infty$$

Finally, if $p = 1$ then

$$I = \int_1^\infty \frac{1}{x}\,dx = \ln x\,\Big|_1^\infty = \ln\infty - \ln 1 = \infty$$

Thus, integral (51) converges for $p > 1$ and diverges to infinity
for $p \le 1$. Hence, by test (50), we can draw the same conclusion
concerning integral (48) if

$$f(x) \sim \frac{k}{x^p} \quad \text{as } x \to \infty$$

For instance,

$$\int_1^\infty \frac{dx}{\sqrt[3]{x^2 + 1}} = \infty$$

because

$$\frac{1}{\sqrt[3]{x^2 + 1}} = \frac{1}{x^{2/3}\,\sqrt[3]{1 + \frac{1}{x^2}}} \sim \frac{1}{x^{2/3}}$$

that is $p = \frac{2}{3} < 1$ here. We also have

$$\int_1^\infty \frac{dx}{\sqrt{x^3 + 1}} < \infty$$

since

$$\frac{1}{\sqrt{x^3 + 1}} \sim \frac{1}{x^{3/2}}$$

and thus $p = \frac{3}{2} > 1$ in this case. Finally,

$$\int_0^\infty e^{-x^2}\,dx < \infty \tag{52}$$

because an exponential function (with a negative exponent) decreases
(as its argument tends to infinity) faster than any power function
(see Sec. IV.14), and therefore comparison test (49) is applicable.
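
The dependence of the power integral on the exponent $p$ is easy to tabulate from the antiderivative. The following is a small sketch, with the function name `power_integral` and the sample exponents chosen for illustration; it evaluates the truncated integral $\int_1^N x^{-p}\,dx$ exactly for growing $N$.

```python
import math

def power_integral(p, big_n):
    """Exact value of the truncated integral of x^(-p) over [1, N]."""
    if p == 1:
        return math.log(big_n)
    return (big_n ** (1 - p) - 1) / (1 - p)

for p in (2 / 3, 1, 3 / 2, 2):
    values = [round(power_integral(p, 10.0 ** e), 3) for e in (2, 4, 6)]
    limit = 1 / (p - 1) if p > 1 else math.inf
    print(f"p = {p:.3f}: N = 1e2, 1e4, 1e6 -> {values}, limit {limit}")
```

For $p = \frac{3}{2}$ and $p = 2$ the values approach $\frac{1}{p-1}$, while for $p = \frac{2}{3}$ and $p = 1$ they grow without bound, which is the behaviour used in the examples above.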

Now we turn to integrals of functions which can assume
values of arbitrary sign:

$$\int_a^\infty f(x)\,dx \tag{53}$$

where either $f(x) \ge 0$ or $f(x) \le 0$ for a given $x$. For such an integral
we shall give only one test: if

$$\int_a^\infty |f(x)|\,dx < \infty \tag{54}$$

then integral (53) converges. In this case the integral is said to be
absolutely convergent, and the function $f(x)$ is called absolutely
integrable on the interval $a \le x < \infty$.

To prove the test we introduce the functions $f^+(x)$ and $f^-(x)$
which are the “positive and the negative parts” of the function $f(x)$:

$$f^+(x) = \begin{cases} f(x) & \text{for } x \text{ such that } f(x) \ge 0 \\ 0 & \text{for } x \text{ such that } f(x) < 0 \end{cases}$$

and

$$f^-(x) = \begin{cases} 0 & \text{for } x \text{ such that } f(x) \ge 0 \\ |f(x)| & \text{for } x \text{ such that } f(x) < 0 \end{cases}$$

These functions are shown in Fig. 266. We can write

$$f(x) = f^+(x) - f^-(x) \tag{55}$$

and

$$|f(x)| = f^+(x) + f^-(x)$$

If condition (54) is fulfilled, the area shaded in Fig. 266d is finite.
Hence, the areas shaded in Fig. 266b and c are also finite. The inequalities
$f^+(x) \ge 0$ and $f^-(x) \ge 0$ holding, the improper integrals
of these functions converge. Now, by equality (55), integral (53)
also converges which is what we set out to prove. More precisely,

$$\int_a^\infty f(x)\,dx = \int_a^\infty f^+(x)\,dx - \int_a^\infty f^-(x)\,dx$$


It can happen that integral (53) converges whereas integral (54)
diverges, that is it equals infinity. This is the so-called conditional
convergence: the integral converges but not absolutely. In this case
the areas shaded in Fig. 266b and c are infinite but [...]
we take into account [...] which leads to [...]

For example, the integral

$$\int_1^\infty \frac{\sin x}{x^2}\,dx \tag{56}$$

is absolutely convergent because

$$\left|\frac{\sin x}{x^2}\right| \le \frac{1}{x^2}$$

and therefore the integral of the absolute value of the integrand
can be compared with integral (51) for $p = 2$.

If we perform the same operation on the integral

$$\int_1^\infty \frac{\sin x}{x}\,dx \tag{57}$$

we shall arrive at integral (51) with $p = 1$ which diverges. Therefore
the comparison test does not apply here [see inequality (49) and
think why the test is inapplicable]. At the same time it is possible
to prove that

$$\int_1^\infty \frac{|\sin x|}{x}\,dx = \infty$$

because the factor $|\sin x|$ under the integral sign oscillates about a
positive mean value. But it turns out that integral (57) nevertheless
converges. To prove this let us perform integration by parts denoting
$\frac{1}{x} = u$, $du = -\frac{1}{x^2}\,dx$, and $\sin x\,dx = dv$, $v = -\cos x$:

$$\int_1^\infty \frac{\sin x}{x}\,dx = -\frac{\cos x}{x}\,\Big|_1^\infty - \int_1^\infty \frac{\cos x}{x^2}\,dx = \cos 1 - \int_1^\infty \frac{\cos x}{x^2}\,dx$$

The last integral is of the same type as (56) and therefore it converges.
Hence, the original integral is also convergent. In other
words, integral (57) converges conditionally but not absolutely.
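
The difference between conditional and absolute convergence of $\int_1^\infty \frac{\sin x}{x}\,dx$ shows up clearly in a numerical experiment. The sketch below is an illustration under assumptions: the helpers `simpson` and `truncated`, the step density and the truncation points are arbitrary choices, and the lower limit 1 is taken from the integration by parts above.

```python
import math

def simpson(f, a, b, m):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    odd = sum(f(a + i * h) for i in range(1, m, 2))
    even = sum(f(a + i * h) for i in range(2, m, 2))
    return h / 3 * (f(a) + f(b) + 4 * odd + 2 * even)

def truncated(f, big_n, steps_per_unit=50):
    m = 2 * int((big_n - 1.0) * steps_per_unit / 2)  # even number of subintervals
    return simpson(f, 1.0, big_n, m)

results = {}
for big_n in (10.0, 100.0, 1000.0):
    signed = truncated(lambda x: math.sin(x) / x, big_n)         # integral (57), truncated
    absolute = truncated(lambda x: abs(math.sin(x)) / x, big_n)  # integral of |sin x| / x
    results[big_n] = (signed, absolute)
    print(big_n, round(signed, 3), round(absolute, 3))
```

The signed integral stabilizes near a finite value while the integral of the absolute value keeps growing roughly like a multiple of $\ln N$.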

The above results are automatically extended to integrals of
complex functions depending on a real argument (see Sec. VIII.6)
and to integrals of the form

$$\int_{-\infty}^b f(x)\,dx \tag{58}$$

which are defined by the relation

$$\int_{-\infty}^b f(x)\,dx = \lim_{N \to -\infty} \int_N^b f(x)\,dx$$

[...]

integral can be carried out in the following way:

$$\int_a^b f(x)\,dx = \int_a^\alpha f(x)\,dx + \int_\alpha^b f(x)\,dx = \lim_{\varepsilon \to +0} \int_{a+\varepsilon}^\alpha f(x)\,dx + \lim_{\varepsilon \to +0} \int_\alpha^{b-\varepsilon} f(x)\,dx =$$

$$= \lim_{\varepsilon \to +0} [F(\alpha) - F(a + \varepsilon)] + \lim_{\varepsilon \to +0} [F(b - \varepsilon) - F(\alpha)] = F(b - 0) - F(a + 0)$$

Thus, in this case we can use our usual formula (11) for evaluating
the definite integral, and if the substitution of the limits of integration
for the argument of the function yields finite results the
integral is convergent.

We now suppose that integral (61) has a singularity lying inside
the interval of integration, for instance, at the point x = c. If an
antiderivative F(x) is known then we have

$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx = [F(c-0) - F(a)] + [F(b) - F(c+0)] = F(b) - F(a) + [F(c-0) - F(c+0)] \tag{63}$$

Consequently, if the antiderivative has no discontinuities, i.e.
if F(c - 0) = F(c + 0), formula (11) can be applied to evaluating
the integral. But if the antiderivative has jump discontinuities
we must make the necessary corrections as we have done in formula
(63). Finally, if the antiderivative has discontinuities of more
complicated types inside the interval of integration, in particular, if
it approaches infinity at some points, then the integral diverges.
The above rules apply to the case of an arbitrary (finite) number of
singularities.

Take an example. The integral $\int_{-1}^{2} \frac{dx}{\sqrt[3]{x}}$ can be evaluated as
follows:

$$\int_{-1}^{2} \frac{dx}{\sqrt[3]{x}} = \int_{-1}^{2} x^{-1/3}\,dx = \frac{x^{2/3}}{2/3}\Big|_{-1}^{2} = \frac{3}{2}\left(2^{2/3} - 1\right) = 0.881$$

because in this case the antiderivative is proportional to $x^{2/3}$, i.e.
it is continuous, and therefore the integral converges. The last
example in Sec. 4 was treated incorrectly because the antiderivative
$-\frac{1}{x}$ which we had there approached infinity on the interval of
integration at the point x = 0 and therefore the integral was divergent.

In evaluating improper integrals we widely use expansions of
their integrands into series of different kinds. If such an expansion
yields a good approximation only near a singularity then the integral 
in question is broken into a sum of two integrals so that one of the 
integrals should be an improper integral taken over a subinterval 
containing the singularity and the other integral should be an 
ordinary (proper) integral. Then the first integral is evaluated by 
means of a series expansion whereas the second one is computed 
according to the methods of § 3. For example, to evaluate integral 
(52) we can perform the following operation: 


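$$\int_0^\infty \frac{dx}{\sqrt{x^3+1}} = \int_0^a \frac{dx}{\sqrt{x^3+1}} + \int_a^\infty \frac{dx}{\sqrt{x^3+1}} = \int_0^a \frac{dx}{\sqrt{x^3+1}} + \int_a^\infty \left(x^{-3/2} - \frac{1}{2}x^{-9/2} + \frac{3}{8}x^{-15/2} - \dots\right)dx =$$

$$= \int_0^a \frac{dx}{\sqrt{x^3+1}} + \underbrace{\frac{2}{\sqrt{a}} - \frac{1}{7\sqrt{a^7}} + \frac{3}{52\sqrt{a^{13}}} - \dots}_{S} \tag{64}$$
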

Here, in expanding the integrand, we have utilized Newton’s second
binomial formula (IV.60). The number a > 0 entering into (64)
can be chosen arbitrarily. If we take very large a we shall encounter
difficulties in evaluating the integral $\int_0^a \frac{dx}{\sqrt{x^3+1}}$ but if we take
a very small a then the terms of the series whose sum is denoted
by S will be very large, and it will be difficult to evaluate the sum.
Let us take a = 2 and evaluate the integral over the interval from 0 to 2
by means of Simpson’s formula (see Sec. 13). We divide the interval of
integration into eight parts. This yields

$$\int_0^2 \frac{dx}{\sqrt{x^3+1}} = 1.402$$

Calculating S with the same accuracy we obtain S = 1.402. Hence,
integral (52) is equal to 2.804 (let the reader verify all the
calculations!).
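The calculation the reader is asked to verify can be sketched numerically as follows. This is only an illustrative check, not the book's own computation: the helper names `simpson` and `tail_series` are hypothetical, and the series S is summed with four terms, which is an assumption about how much accuracy is wanted.

```python
import math

def f(x):
    # Integrand of integral (52)
    return 1.0 / math.sqrt(x ** 3 + 1.0)

def simpson(g, a, b, n):
    # Composite Simpson rule with n (even) subintervals
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def tail_series(a, terms=4):
    # S = 2/sqrt(a) - 1/(7 sqrt(a^7)) + 3/(52 sqrt(a^13)) - ...
    # obtained by integrating the binomial expansion of
    # x^(-3/2) * (1 + x^(-3))^(-1/2) term by term from a to infinity
    s, coeff = 0.0, 1.0
    for k in range(terms):
        power = 1.5 + 3 * k                      # term is coeff * x^(-power)
        s += coeff * a ** (1.0 - power) / (power - 1.0)
        coeff *= (-0.5 - k) / (k + 1)            # next binomial coefficient
    return s

a = 2.0
proper_part = simpson(f, 0.0, a, 8)              # eight parts, as in the text
S = tail_series(a)
total = proper_part + S
print(round(proper_part, 3), round(S, 3), round(total, 3))
```

Changing `a` in this sketch shows the trade-off described above: a large a makes the proper part harder for Simpson's formula, a small a makes the series converge slowly.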


When we deal with a divergent integral, for instance, of form
(41), we can encounter the problem of characterizing the behaviour
of its “finite part” (42), which approaches infinity as $N \to \infty$, in
a more precise manner. The investigation can also be carried out
by means of expansions into series. For example,

$$\int_0^N \frac{dx}{\sqrt[3]{x^2+1}} = \int_0^a \frac{dx}{\sqrt[3]{x^2+1}} + \int_a^N x^{-2/3}\left(1 - \frac{1}{3}x^{-2} + \frac{2}{9}x^{-4} - \dots\right)dx =$$

$$= 3N^{1/3} + C + \frac{1}{5}N^{-5/3} - \frac{2}{33}N^{-11/3} + \dots$$

where $C = \int_0^a \frac{dx}{\sqrt[3]{x^2+1}} - 3a^{1/3} - \frac{1}{5}a^{-5/3} + \frac{2}{33}a^{-11/3} - \dots$ is a constant
which can be calculated by means of the technique applied
to the previous example.

An analogous problem can arise when we investigate a convergent
integral. In this case integral (41) tends to a finite limit and it is
the rate of its variation in the process of approaching the limit that
can be investigated by means of expansions into series. For example,
reasoning as we did in performing calculations (64), we obtain

$$\int_0^N \frac{dx}{\sqrt{x^3+1}} = \int_0^\infty \frac{dx}{\sqrt{x^3+1}} - \int_N^\infty \frac{dx}{\sqrt{x^3+1}} = 2.804 - \frac{2}{\sqrt{N}} + \frac{1}{7\sqrt{N^7}} - \dots$$


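Integrating by parts we find

$$\int_0^N e^{-x^2}dx = \int_0^\infty e^{-x^2}dx - \int_N^\infty \frac{1}{2x}\,2x e^{-x^2}dx = \int_0^\infty e^{-x^2}dx + \frac{1}{2x}e^{-x^2}\Big|_N^\infty + \frac{1}{2}\int_N^\infty \frac{1}{x^2}e^{-x^2}dx =$$

$$= \int_0^\infty e^{-x^2}dx - \frac{1}{2N}e^{-N^2} + \text{a quantity of the order of } \frac{e^{-N^2}}{N^3}$$
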

The above estimation can also be easily proved with the help
of L’Hospital’s rule, which we leave to the reader. To refine the
expansion we can perform repeated integration by parts.
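The size of the neglected remainder can be checked numerically. The sketch below is an illustration under one assumption not made in the text: it uses the standard identity that the integral of $e^{-x^2}$ from 0 to N equals $\frac{\sqrt{\pi}}{2}\,\mathrm{erf}(N)$, with `math.erf` from the Python standard library; the function names are hypothetical.

```python
import math

def gauss_integral(N):
    # Integral of exp(-x^2) from 0 to N, expressed through the error function
    return 0.5 * math.sqrt(math.pi) * math.erf(N)

def gauss_estimate(N):
    # Limit value minus the leading correction term exp(-N^2) / (2N)
    return 0.5 * math.sqrt(math.pi) - math.exp(-N * N) / (2.0 * N)

for N in (1.0, 2.0, 3.0):
    remainder = gauss_integral(N) - gauss_estimate(N)
    # The remainder equals (1/2) * integral of exp(-x^2)/x^2 from N to infinity,
    # which is positive and smaller than exp(-N^2) / (4 N^3)
    bound = math.exp(-N * N) / (4.0 * N ** 3)
    print(N, remainder, bound)
```

For each N the remainder stays below the bound, in agreement with the statement that it is of the order of $e^{-N^2}/N^3$.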

17. Gamma Function. As an important example of an improper
integral let us consider the non-elementary gamma function
introduced by Euler in 1729:

$$\Gamma(p) = \int_0^\infty e^{-x}x^{p-1}\,dx \tag{65}$$

This expression is also called Euler’s integral of the second kind.

Integral (65) is improper since it has infinity as its upper limit
of integration and, besides, it has a singularity at x = 0 for p < 1.
We know that $e^{-x}$ tends to zero, as $x \to \infty$, faster than any negative
power of x and therefore the behaviour of the integrand at $x = \infty$
does not affect the convergence or divergence of the integral. On the
other hand, we have $e^{-x}x^{p-1} \sim \frac{1}{x^{1-p}}$ for $x \to 0$ and therefore, by
the beginning of Sec. 16, integral (65) converges for 1 - p < 1,
i.e. for p > 0, and diverges for $1 - p \ge 1$, i.e. for $p \le 0$. Therefore
we shall consider formula (65) for $0 < p < \infty$.

To deduce the basic property

$$\Gamma(p+1) = p\,\Gamma(p) \tag{66}$$

of the gamma function we integrate by parts:

$$\Gamma(p+1) = \int_0^\infty e^{-x}x^{(p+1)-1}\,dx = \int_0^\infty e^{-x}x^{p}\,dx = -e^{-x}x^{p}\Big|_0^\infty + \int_0^\infty e^{-x}p\,x^{p-1}\,dx$$

which implies (66).
Further, we readily find

$$\Gamma(1) = \int_0^\infty e^{-x}x^{1-1}\,dx = \int_0^\infty e^{-x}\,dx = -e^{-x}\Big|_0^\infty = 1$$

If we now substitute, in succession, p = 1, 2, 3, . . . into formula
(66) we obtain

$$\Gamma(2) = 1\cdot\Gamma(1) = 1, \quad \Gamma(3) = 2\,\Gamma(2) = 2\cdot 1, \quad \Gamma(4) = 3\,\Gamma(3) = 3\cdot 2\cdot 1 \quad\text{etc.}$$

Generally,

$$\Gamma(n+1) = n! \quad (n = 1, 2, 3, \dots) \tag{67}$$

Thus we see that the gamma function yields a representation of
the factorial function. At the same time the gamma function also
assumes certain values for non-integral values of the argument and
therefore it extends the factorial function (see Sec. I.15) from discrete
values of the argument to the continuous range of the argument.
The graph of the function is shown in Fig. 268. The equality
$\Gamma(+0) = +\infty$ suggested by formula (66) is also illustrated in
the figure.
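Properties (66) and (67) are easy to check numerically. The lines below are a minimal sketch that relies on `math.gamma` from the Python standard library rather than on integral (65) itself; the sample value p = 2.5 is an arbitrary choice.

```python
import math

# Recurrence (66): Gamma(p + 1) = p * Gamma(p), checked at a non-integral p
p = 2.5
lhs = math.gamma(p + 1)
rhs = p * math.gamma(p)

# Formula (67): Gamma(n + 1) = n!
factorial_pairs = [(n, math.gamma(n + 1), math.factorial(n)) for n in range(1, 8)]
for n, g, fact in factorial_pairs:
    print(n, g, fact)
```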
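In particular, formula (67) implies that

$$0! = \Gamma(1) = 1$$

Further, we have

$$\left(-\frac{1}{2}\right)! = \Gamma\left(-\frac{1}{2} + 1\right) = \Gamma\left(\frac{1}{2}\right) = \sqrt{\pi} = 1.772$$

We shall establish the last relation $\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}$ in Sec. 18 [see
formula (71)]. It follows that

$$\left(\frac{1}{2}\right)! = \Gamma\left(\frac{3}{2}\right) = \frac{1}{2}\,\Gamma\left(\frac{1}{2}\right) = \frac{\sqrt{\pi}}{2} = 0.886,$$

$$\left(\frac{3}{2}\right)! = \Gamma\left(\frac{5}{2}\right) = \frac{3}{2}\,\Gamma\left(\frac{3}{2}\right) = \frac{3\sqrt{\pi}}{4} = 1.329 \quad\text{etc.}$$
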

The gamma function can also be defined for negative values of
the argument but it is impossible to use formula (65) for this
purpose since the integral diverges for p < 0. However, we
can rewrite formula (66) in the form

$$\Gamma(p) = \frac{\Gamma(p+1)}{p} \tag{68}$$

and use it for defining the gamma function for negative p.

Fig. 269

Indeed, if -1 < p < 0 then 0 < p + 1 < 1 and therefore the
right-hand side of (68) makes sense for -1 < p < 0. Hence,
formula (68) defines the values of the gamma function $\Gamma(p)$ for such p.
It should be noted that in this way we obtain $\Gamma(p) < 0$ for -1 < p < 0.
Further, if we take -2 < p < -1 then -1 < p + 1 < 0
and therefore the right-hand side is defined for these p by the
preceding extension of the gamma function into the interval
-1 < p < 0. By the way, we see that $\Gamma(p) > 0$ for -2 < p < -1.
We next define $\Gamma(p)$ for -3 < p < -2 in the same way etc. Hence,
$\Gamma(p)$ has been defined for values of p of arbitrary sign, and
formula (66) holds for all p. Applying formula (68) we conclude,
in succession, that $\Gamma(0) = \pm\infty$, $\Gamma(-1) = \pm\infty$, $\Gamma(-2) = \pm\infty$
etc. The graph of the gamma function for the negative values of
the argument is shown in Fig. 269.
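The step-by-step extension by formula (68) can be written as a short recursion. This is a sketch only: `gamma_extended` is a hypothetical name, and for positive arguments it falls back on the library function `math.gamma` instead of evaluating integral (65).

```python
import math

def gamma_extended(p):
    # For p > 0 use the ordinary gamma function; for negative non-integral p
    # apply formula (68), Gamma(p) = Gamma(p + 1) / p, until the argument
    # becomes positive.
    if p > 0:
        return math.gamma(p)
    if p == int(p):
        raise ValueError("Gamma is infinite at 0, -1, -2, ...")
    return gamma_extended(p + 1) / p

for p in (-0.5, -1.5, -2.5):
    print(p, gamma_extended(p))
```

The printed values alternate in sign from one unit interval to the next, as described in the text.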

There are extensive tables of the gamma function. In particular, 
they can be found in [23]. 

18. Beta Function. The beta function, or Euler’s integral of the 
first kind, is defined by the formula 

$$B(p, q) = \int_0^1 x^{p-1}(1-x)^{q-1}\,dx \tag{69}$$

Here we must have p > 0 and q > 0 since otherwise the behaviour
of the integrand as x approaches the upper and the lower limits of
integration yields the divergence of the integral (why is it so?).
It should be noticed that indefinite integral (69) is expressible in
terms of elementary functions only for certain specific combinations
of the values of the exponents p and q (see Sec. XIII.9).

As we shall show in Sec. XVI.17, the beta function is expressed
in terms of the gamma function by the formula

$$B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)} \tag{70}$$

The formula suggests an interesting corollary: if we put
$p = q = \frac{1}{2}$ we obtain

$$\Gamma^2\left(\frac{1}{2}\right) = B\left(\frac{1}{2}, \frac{1}{2}\right) = \int_0^1 x^{-1/2}(1-x)^{-1/2}\,dx = \int_0^1 \frac{dx}{\sqrt{x - x^2}} = -\arcsin(1 - 2x)\Big|_0^1 = \pi$$

Hence, the inequality $\Gamma(p) > 0$ holding for p > 0, we have

$$\Gamma\left(\frac{1}{2}\right) = \sqrt{\pi} \tag{71}$$
From this we can deduce the value of an important integral of the
form $\int_0^\infty e^{-x^2}dx$ by using the substitution $x = \sqrt{t}$:

$$\int_0^\infty e^{-x^2}dx = \frac{1}{2}\int_0^\infty e^{-t}t^{-1/2}\,dt = \frac{1}{2}\int_0^\infty e^{-t}t^{\frac{1}{2}-1}\,dt = \frac{1}{2}\,\Gamma\left(\frac{1}{2}\right) = \frac{\sqrt{\pi}}{2} \tag{72}$$


Many definite integrals containing power and exponential
functions which cannot be expressed in terms of elementary functions
for arbitrary values of the parameters involved can often be expressed
in terms of the beta function and, hence, in terms of the gamma
function. For instance, we have

$$\int_0^{\pi/2} \sin^p x\,dx = \int_0^1 \frac{1}{2}\,t^{\frac{p-1}{2}}(1-t)^{-\frac{1}{2}}\,dt = \frac{1}{2}\,B\left(\frac{p+1}{2}, \frac{1}{2}\right) = \frac{\sqrt{\pi}}{2}\,\frac{\Gamma\left(\frac{p+1}{2}\right)}{\Gamma\left(\frac{p}{2}+1\right)} \quad\text{for } p > -1$$

(we have used the substitution $\sin x = \sqrt{t}$ here),


$$\int_0^\infty \frac{x^{p-1}}{(1+x)^{p+q}}\,dx = \int_0^1 \frac{y^{p-1}(1-y)^{p+q}}{(1-y)^{p-1}(1-y)^{2}}\,dy = \int_0^1 y^{p-1}(1-y)^{q-1}\,dy = B(p, q) \quad\text{for } p > 0,\ q > 0 \tag{73}$$

where $x = \frac{y}{1-y}$. Further, we have

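$$\int_0^\infty \frac{dx}{(1+x^p)^q} = \frac{1}{p}\int_0^\infty \frac{y^{\frac{1}{p}-1}}{(1+y)^q}\,dy = \frac{1}{p}\,B\left(\frac{1}{p},\ q - \frac{1}{p}\right) \quad\text{for } p > 0,\ qp > 1 \tag{74}$$

where $x = y^{1/p}$ [we have utilized formula (73) here]. In particular,
(74) yields the value of integral (52):

$$\int_0^\infty \frac{dx}{\sqrt{1+x^3}} = \frac{1}{3}\,B\left(\frac{1}{3}, \frac{1}{2} - \frac{1}{3}\right) = \frac{1}{3}\,B\left(\frac{1}{3}, \frac{1}{6}\right) = \frac{1}{3}\,\frac{\Gamma\left(\frac{1}{3}\right)\Gamma\left(\frac{1}{6}\right)}{\Gamma\left(\frac{1}{2}\right)} = \frac{2.679 \times 5.566}{3 \times 1.772} = 2.804$$

(the values of the gamma function have been taken from the tables).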
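Instead of tables, the same values can be obtained from a library gamma function. The following sketch evaluates formula (74) through formula (70); the function names `beta` and `integral_74` are hypothetical, and `math.gamma` is the Python standard library implementation.

```python
import math

def beta(p, q):
    # Formula (70): B(p, q) = Gamma(p) Gamma(q) / Gamma(p + q)
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def integral_74(p, q):
    # Formula (74): integral of dx / (1 + x^p)^q over (0, infinity),
    # valid for p > 0 and q * p > 1
    return beta(1.0 / p, q - 1.0 / p) / p

value_52 = integral_74(3.0, 0.5)     # integral (52)
arctan_case = integral_74(2.0, 1.0)  # integral of dx / (1 + x^2), equal to pi/2
sin_squared = 0.5 * beta(1.5, 0.5)   # integral of sin^2 x over (0, pi/2), equal to pi/4
print(value_52, arctan_case, sin_squared)
```

The two elementary cases are included only as a sanity check of the formulas against integrals whose values are known independently.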
19. Principal Value of Divergent Integral. It is sometimes
advisable to attribute certain numerical values to some divergent
integrals, in a conditional sense. For instance, such a situation may occur
in investigating physical processes in continuous media. There are
different ways of employing these values. Let us consider Cauchy’s
method. Suppose an integral of the form

$$\int_a^b f(x)\,dx \tag{75}$$

has only one singularity inside the interval of integration, say at
the point x = c. We cut off a subinterval containing the singularity
which is symmetric with respect to c. Then we pass to the limit and
put, by definition,

$$\mathrm{v.p.}\int_a^b f(x)\,dx = \lim_{\varepsilon\to+0}\left[\int_a^{c-\varepsilon} f(x)\,dx + \int_{c+\varepsilon}^{b} f(x)\,dx\right] \tag{76}$$


Such a limit can exist even if integral (75) is divergent in the
ordinary sense of the definition given in Sec. 16. If the limit exists we
call (76) the principal value (Cauchy’s principal value) of (75). The
notation v.p. in (76) is the abbreviation of the French valeur
principale (principal value). An integral of this type is often called
a singular integral to distinguish it from proper and convergent
improper integrals which are called regular integrals. Similarly, the
principal value of an improper integral taken over the whole axis
of the argument of the integrand is introduced, by definition, as

$$\mathrm{v.p.}\int_{-\infty}^{\infty} f(x)\,dx = \lim_{N\to\infty}\int_{-N}^{N} f(x)\,dx$$


For instance, the integral $\int_{-1}^{2}\frac{dx}{x}$ is divergent because the
antiderivative ln |x| of the integrand has a discontinuity at the point
x = 0 belonging to the interval of integration at which it approaches
infinity. At the same time its principal value

$$\mathrm{v.p.}\int_{-1}^{2}\frac{dx}{x} = \lim_{\varepsilon\to+0}\left[\int_{-1}^{-\varepsilon}\frac{dx}{x} + \int_{\varepsilon}^{2}\frac{dx}{x}\right] = \lim_{\varepsilon\to+0}\left[\ln|x|\,\Big|_{-1}^{-\varepsilon} + \ln|x|\,\Big|_{\varepsilon}^{2}\right] =$$

$$= \lim_{\varepsilon\to+0}\,[\ln\varepsilon - \ln 1 + \ln 2 - \ln\varepsilon] = \ln 2 = 0.693$$

exists because the summands $\pm\ln\varepsilon$ mutually cancel out before we
pass to the limit. Another example is the integral $\int_{-\infty}^{\infty}\sin x\,dx$:

$$\mathrm{v.p.}\int_{-\infty}^{\infty}\sin x\,dx = \lim_{N\to\infty}\int_{-N}^{N}\sin x\,dx = \lim_{N\to\infty}\,(-\cos x)\Big|_{-N}^{N} = \lim_{N\to\infty}\,[-\cos N + \cos N] = 0$$

Thus, the principal value of the integral exists although the
integral diverges, that is it is singular.

It is apparent that not all divergent integrals possess principal 
values. 
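The role of the symmetric cut in definition (76) can be seen on the example of the integral of dx/x from -1 to 2. The sketch below is an illustration, not part of the original argument: `cut_integral` is a hypothetical helper that uses the antiderivative ln |x| on both sides of the singularity, and the asymmetric cut is an added comparison.

```python
import math

def cut_integral(eps_left, eps_right):
    # Integral of dx/x over [-1, -eps_left] plus the integral over [eps_right, 2],
    # both evaluated through the antiderivative ln|x|
    left = math.log(eps_left) - math.log(1.0)
    right = math.log(2.0) - math.log(eps_right)
    return left + right

for eps in (1e-2, 1e-4, 1e-8):
    symmetric = cut_integral(eps, eps)         # the cut used in definition (76)
    asymmetric = cut_integral(eps, 2.0 * eps)  # a cut twice as wide on the right
    print(eps, symmetric, asymmetric)
```

The symmetric cut gives ln 2 for every eps, while the asymmetric one gives 0: the value attributed to a divergent integral depends on how the singularity is cut off, which is why the symmetric choice is fixed by definition.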


§ 5. Integrals Dependent on Parameters 
20. Proper Integrals. Take an integral of the form

$$I = \int_a^b f(x, \lambda)\,dx \tag{77}$$

whose integrand depends on a parameter (arbitrary constant) $\lambda$
besides the variable of integration x. The parameter $\lambda$ is regarded
as constant in the process of integration but generally it can assume
different values for which integral (77) is evaluated. And, generally
speaking, the result of the integration can also depend on $\lambda$, i.e.
$I = I(\lambda)$. Such integrals occur in applications when an integrand
can involve such parameters as masses, sizes etc. which are kept
constant in the process of integration. For the sake of simplicity,
we shall take integrands which only depend on one parameter
although similar results are obtained in the case of many parameters.
We first consider several formal examples:

$$\int_0^1 (x^2 + \lambda x)\,dx = \frac{1}{3} + \frac{\lambda}{2}, \qquad \int_0^\pi \sin ax\,dx = \frac{1 - \cos a\pi}{a} \quad\text{and}\quad \int_0^1 (s+1)\,x^s\,dx = 1 \quad (s > -1)$$

In this section we shall take proper integrals of form (77), that
is the limits of integration and the integrand will be finite.
Let us consider some properties of these integrals.

1. If the integrand is a continuous function of $\lambda$ for $a \le x \le b$
then the integral I is also continuous in $\lambda$. For example, this is
implied by the geometric meaning of an integral as the area of
a curvilinear trapezoid: if an infinitesimal variation of $\lambda$ yields an
infinitesimal change of the form and of the sizes of the curvilinear
side of the trapezoid (which is the graph of the integrand) then the
area should also gain an infinitesimal increment.

It should be remarked that at the same time the function f may
not be continuous in x and may have finite discontinuities.

We sometimes encounter integrals whose limits of integration
can also depend on a parameter:

$$I(\lambda) = \int_{a(\lambda)}^{b(\lambda)} f(x, \lambda)\,dx \tag{78}$$

Then, for $I(\lambda)$ to be continuous, it is sufficient to set the additional
condition that the functions $a(\lambda)$ and $b(\lambda)$ have no discontinuities.

2. The Leibniz formula

$$\frac{dI}{d\lambda} = \int_a^b f'_\lambda(x, \lambda)\,dx \tag{79}$$

states that it is permissible to differentiate integral (77) with
respect to the parameter under the integral sign. The reason is that
integral (77) is analogous to a sum of a great number of very small
summands (see Sec. 2), each of them depending on $\lambda$. Since the
termwise differentiation of a sum is permissible because the derivative
of a sum equals the sum of the derivatives, formula (79) can be
deduced by passing to the limit.

Here we understand formula (79) in the simplest sense, namely,
we suppose that integral (77) is proper and integral (79) is proper
or improper but convergent. But there are cases when integral (79)
diverges. Then formula (79) remains true provided we understand
it in a generalized sense which will be discussed in Sec. 27.

When differentiating integral (78) we must take into account that
$\lambda$ enters into the formula as a parameter on which the integrand
depends and as a variable on which both upper and lower limits
of integration depend. Therefore we must use the formula for the
derivative of a composite function (see Sec. IX.12) and formulas for
the derivatives of an integral with respect to its upper and lower
limits (see Sec. 4). This implies

$$\frac{dI}{d\lambda} = \int_{a(\lambda)}^{b(\lambda)} f'_\lambda(x, \lambda)\,dx + f(b(\lambda), \lambda)\,b'(\lambda) - f(a(\lambda), \lambda)\,a'(\lambda) \tag{80}$$
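Formula (80) can be checked numerically on a concrete case. The sketch below is an illustration under assumptions that are not in the text: the integrand sin(λx), the limits a(λ) = λ and b(λ) = λ², the Simpson helper and the central difference used for comparison are all chosen here only for the demonstration.

```python
import math

def simpson(g, a, b, n=200):
    # Composite Simpson rule with n (even) subintervals
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def f(x, lam):
    return math.sin(lam * x)

def f_lam(x, lam):
    # Partial derivative of f with respect to the parameter
    return x * math.cos(lam * x)

def a(lam):
    return lam

def b(lam):
    return lam ** 2

def I(lam):
    # Integral (78) for the chosen integrand and limits
    return simpson(lambda x: f(x, lam), a(lam), b(lam))

def dI_by_formula(lam):
    # Right-hand side of (80), with a'(lam) = 1 and b'(lam) = 2 * lam
    inner = simpson(lambda x: f_lam(x, lam), a(lam), b(lam))
    return inner + f(b(lam), lam) * 2.0 * lam - f(a(lam), lam) * 1.0

def dI_by_difference(lam, h=1e-5):
    # Central difference quotient of I, used only as an independent check
    return (I(lam + h) - I(lam - h)) / (2.0 * h)

print(dI_by_formula(1.5), dI_by_difference(1.5))
```

Dropping the two boundary terms in `dI_by_formula` makes the two printed numbers disagree, which shows that the dependence of the limits on λ cannot be ignored.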
3. When we integrate an integral dependent on a parameter with
respect to the parameter, it is permissible to integrate the integrand
under the sign of the integral in (77), that is

$$\int_\alpha^\beta I(\lambda)\,d\lambda = \int_a^b \left(\int_\alpha^\beta f(x, \lambda)\,d\lambda\right) dx$$

The assertion is justified in the same way as property 2.

21. Improper Integrals. We shall take the case of integrals of
the form

$$I(\lambda) = \int_a^\infty f(x, \lambda)\,dx \tag{81}$$

which have no singularities for finite x. Improper integrals of other
types (see Sec. 16) dependent on parameters possess similar
properties. Obviously, we suppose that integral (81) converges.
But contrary to Sec. 20, here we can have a situation when the
dependence of an integral on a parameter $\lambda$ may not be continuous
even if the function f is continuous in $\lambda$, which is a new fact for us.
This is possible because even an infinitesimal variation of a function
over an infinite interval of integration may lead to a finite variation
of the value of the integral.

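For instance, we shall show in Sec. XVII.32 that

$$\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}$$

It immediately follows that for $\lambda > 0$ we have

$$I(\lambda) = \int_0^\infty \frac{\sin\lambda x}{x}\,dx = \int_0^\infty \frac{\sin s}{s}\,ds = \frac{\pi}{2}$$

(we have made the substitution $\lambda x = s$). At the same time we have
I = 0 for $\lambda = 0$ and, for $\lambda < 0$, we obtain

$$I(\lambda) = \int_0^\infty \frac{\sin\lambda x}{x}\,dx = -\int_0^\infty \frac{\sin|\lambda| x}{x}\,dx = -\frac{\pi}{2}$$

Hence, in this example $I(\lambda)$ has a jump discontinuity at $\lambda = 0$.
One can find it strange that $I(\lambda)$ is discontinuous because the value
of the improper integral

$$I(\lambda) = \lim_{N\to\infty}\int_0^N \frac{\sin\lambda x}{x}\,dx$$
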
is obtained as the limit of the values of the corresponding proper
integrals, and each of these proper integrals is continuous in $\lambda$.
The explanation is that the limit of a sequence of continuous
functions may not be a continuous function, as will be shown here.
We now take the functions

$$G_N(\lambda) = \int_0^N \frac{\sin\lambda x}{x}\,dx$$

whose graphs are depicted in Fig. 270 for small values of N and
for large values of N. These functions are continuous in $\lambda$ but the
transition from the values which are close to $-\frac{\pi}{2}$ to the values
which are close to $\frac{\pi}{2}$ takes place on a small interval of variation
of $\lambda$ for large N. The greater N, the smaller the interval. Therefore
in the limiting process, for $N = \infty$, this transition takes place
on an infinitely small interval of the $\lambda$-axis, that is there appears a
discontinuity.

The possibility of the existence of such discontinuities complicates
the investigation of integrals of form (81) and, particularly, it
makes it impossible to apply directly properties 2 and 3 in Sec. 20.
Therefore we sometimes have to replace integral (81) by an integral
taken over a finite interval from a to N and then pass to the limit as
$N \to \infty$. Nevertheless, as we shall show, there exists an important
particular case when such discontinuities are impossible.
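The steepening of the transition near λ = 0 can be observed numerically. The sketch below approximates the proper integral of sin(λx)/x over [0, N] by Simpson's rule; the function name `G`, the number of subintervals and the sample values of N are assumptions made only for this illustration.

```python
import math

def G(N, lam, n=20000):
    # Simpson approximation of the integral of sin(lam * x) / x over [0, N];
    # at x = 0 the integrand is replaced by its limit value lam
    def g(x):
        return lam if x == 0.0 else math.sin(lam * x) / x
    h = N / n
    total = g(0.0) + g(N)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

for N in (1.0, 20.0, 200.0):
    print(N, [round(G(N, lam), 3) for lam in (-0.5, -0.05, 0.0, 0.05, 0.5)])
```

For N = 1 the values change gently with λ, while for N = 200 the values at λ = ±0.5 are already close to ±π/2, although every function in the family is continuous and vanishes at λ = 0.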

Suppose that the function $f(x, \lambda)$ satisfies the conditions

$$|f(x, \lambda)| \le \varphi(x) \quad (a < x < \infty) \quad\text{and}\quad \int_a^\infty \varphi(x)\,dx < \infty \tag{82}$$

for all the values of $\lambda$ in question where $\varphi(x)$ is a certain function.
Then, by Sec. 15, integral (81) converges for all $\lambda$. In this case we
shall say that the convergence of the integral is regular. For instance,
this is the case for the integral

$$\int_1^\infty \frac{\sin\lambda x}{x^2}\,dx$$

since

$$\left|\frac{\sin\lambda x}{x^2}\right| \le \frac{1}{x^2} \quad\text{and}\quad \int_1^\infty \frac{dx}{x^2} = 1 < \infty$$


We can assert that if the integrand of integral (81) depends
continuously on $\lambda$ then in the case of regular convergence the integral I
is continuous in $\lambda$. In fact, the integral can be represented in the
form

$$I(\lambda) = \int_a^N f(x, \lambda)\,dx + \int_N^\infty f(x, \lambda)\,dx$$

The first summand is a proper integral and it is therefore continuous
in $\lambda$. The second summand can be estimated as

$$\left|\int_N^\infty f(x, \lambda)\,dx\right| \le \int_N^\infty |f(x, \lambda)|\,dx \le \int_N^\infty \varphi(x)\,dx$$

and, by condition (82), this integral becomes small for all the values
of $\lambda$ simultaneously, for sufficiently large N (see Sec. 15). The
variation of the whole sum $I(\lambda)$ corresponding to a small variation
of $\lambda$ must therefore be small which means that I depends
continuously on $\lambda$.

The properties of regularly convergent integrals are completely 
analogous to those of proper integrals described in Sec. 20. 


§ 6. Line Integrals 

22. Line Integrals of the First Type. The third example in Sec. 1
is an example of a line integral of this type. The general definition
is formulated as follows.

Suppose there is a curve (L) of finite length lying in space or in
a plane. Let a quantity u be determined at each point of the curve.
If we reckon the arc length s along the curve (L) from a certain
point of the curve then we can regard u as a function of s: u = f(s).
To form an integral sum we break up the curve (L) into small
elementary arcs. Let the points of division correspond to the values

$$\alpha = s_0 < s_1 < \dots < s_n = \beta$$

where $s = \alpha$ and $s = \beta$ are, respectively, the values corresponding
to the ends of the curve (L), and n is the number of elementary arcs
$s_{k-1} \le s \le s_k$. Now we choose an arbitrary point $\sigma_k$ on each
elementary arc (that is $s_{k-1} \le \sigma_k \le s_k$) and form the integral sum

$$\sum_{k=1}^{n} f(\sigma_k)\,\Delta s_k, \qquad \Delta s_k = s_k - s_{k-1}$$

To obtain the integral we must pass to the limit in the process
when all the lengths of the elementary arcs are decreased
unlimitedly (compare with Sec. 2), i.e.

$$\int_{(L)} u\,ds = \int_{(L)} f(s)\,ds = \int_\alpha^\beta f(s)\,ds = \lim\sum_{k=1}^{n} f(\sigma_k)\,\Delta s_k \tag{83}$$


It is integral (83) in which the integration is performed with
respect to the arc length that is called a line integral of the first
type. For instance, in the above-mentioned example from Sec. 1
we have

$$M = \int_{(L)} \rho\,ds$$

Similarly, if a point describes a trajectory (L) and a force $\mathbf{F}$ is
acting on the point in this motion ($\mathbf{F}$ can be variable in the general
case) then the work performed by the force is (compare with the
corresponding example in Sec. 6)

$$A = \int_{(L)} F\cos(\mathbf{F}, \boldsymbol{\tau})\,ds$$

where $\boldsymbol{\tau}$ is the unit vector in the direction of the tangent to (L).

Hence, this work is a line integral of the function $f = F\cos(\mathbf{F}, \boldsymbol{\tau})$
with respect to the arc length.

Formula (83) indicates that a line integral of the first type is
a modification of the ordinary definite integral, and therefore many
properties of the definite integral (in particular, properties 2-5
in Sec. 4 and those enumerated in Sec. 5) are automatically extended
to the line integral. But at the same time it should be taken into
account that $\Delta s$, and therefore ds, is always considered to be
positive which implies that when we pass from (83) to an ordinary definite
integral we must perform integration from a smaller value of s
to a larger value. Therefore property 1 in Sec. 4 makes no sense
in the case of integrals with respect to the arc length.

A quantity u can sometimes be defined over the whole space.
For instance, it may be represented by a function u = f(x, y, z).
Then integral (83) can be put down in the form

$$I = \int_{(L)} f(x, y, z)\,ds$$
If a curve (L) is represented parametrically in the form x = x(t),
y = y(t) and z = z(t) the evaluation of the corresponding integral
can be performed according to the formula

$$I = \int_\gamma^\delta f(x(t), y(t), z(t))\,\sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\,dt$$

where the values $t = \gamma$ and $t = \delta$ correspond to the ends of the
curve (L). Here we have taken the expression of ds given in
Sec. VII.23. On the basis of Sec. VII.23, we can also write an
integral of the first type in the form $\int_{(L)} u\,|d\mathbf{r}|$. We also write it as
$\int_{(L)} u(M)\,ds$ where M is the variable point of the curve (L). The
corresponding integral sum can also be written in the form

$$\sum_{k=1}^{n} u_k\,\Delta s_k = \sum_{k=1}^{n} u(M_k)\,\Delta s_k$$

where $M_k$ is a point belonging to the kth elementary arc.

Let us take an example illustrating an application of the line
integral of the first type. It is well known in mechanics that if we
are given a system of material points $M_k(x_k, y_k)$ of masses $m_k$
lying in the x, y-plane where k = 1, 2, . . ., n, then the
coordinates of the centre of gravity of the system are defined by the
formulas

$$x_c = \frac{m_1x_1 + m_2x_2 + \dots + m_nx_n}{m_1 + m_2 + \dots + m_n}, \qquad y_c = \frac{m_1y_1 + m_2y_2 + \dots + m_ny_n}{m_1 + m_2 + \dots + m_n}$$

Now suppose that we have a plane material line (L) with linear
density $\rho$ which may be variable in the general case. The centre of
gravity of the material curve can be found in the following way.

Let us divide (mentally) the curve (L) into small elementary arcs
$\Delta s_k$ and replace each of the arcs by the material point of mass
$m_k = \rho_k\Delta s_k$ lying on the arc. Thus we obtain a “discrete model” of
the material line. The centre of gravity of the model has the
coordinates

$$x_c = \frac{\sum_{k=1}^{n} x_k\rho_k\,\Delta s_k}{\sum_{k=1}^{n} \rho_k\,\Delta s_k}, \qquad y_c = \frac{\sum_{k=1}^{n} y_k\rho_k\,\Delta s_k}{\sum_{k=1}^{n} \rho_k\,\Delta s_k} \tag{84}$$

Now if we pass to the limit, as the lengths of the elementary arcs
are decreased unlimitedly, our model will turn into the continuous
curve (L) whose centre of gravity, by formulas (84), will have the
coordinates

$$x_c = \frac{\int_{(L)} \rho x\,ds}{\int_{(L)} \rho\,ds}, \qquad y_c = \frac{\int_{(L)} \rho y\,ds}{\int_{(L)} \rho\,ds} \tag{85}$$


If we take a curve with constant linear density the formulas for
the centre of gravity are simplified. Namely, cancelling out
$\rho = \text{const}$ in formulas (85) we obtain the formulas for the so-called
geometric centre of gravity of the curve (L):

$$x_{g.c.} = \frac{\int_{(L)} x\,ds}{L}, \qquad y_{g.c.} = \frac{\int_{(L)} y\,ds}{L} \tag{86}$$

where L designates the length of the curve (L).

Let us compare the second formula (86) with formula (36) of the
area of the surface of a solid of revolution. We see that

$$S = 2\pi\int_{(L)} y\,ds = L\cdot 2\pi y_{g.c.}$$

In other words, if a plane curve rotates about an axis lying in the
plane of the curve and not intersecting it then the area of the surface
of revolution thus obtained is equal to the product of the length
of the curve by the distance passed by its geometric centre of gravity.
This is Guldin’s first theorem named after the Swiss mathematician
P. Guldin (1577-1643) who applied the theorem. For instance, the
theorem implies that the surface area of a torus (see Fig. 271),
obtained by the rotation of a circle of radius r about an axis lying in the
same plane as the circle and not intersecting it, is equal to

$$S = 2\pi r\cdot 2\pi R = 4\pi^2 rR$$

where R is the distance from the centre of the circle to the axis
of revolution.

We shall come back to the notion of a line integral of the first
type in Sec. XVI.1.
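The torus formula can be compared with a direct numerical evaluation of the line integral of the first type. In the sketch below the circle is parametrized as x = r cos t, y = R + r sin t and rotated about the x-axis; the helper name `line_integral_ds`, the midpoint rule and the sample values r = 1, R = 3 are assumptions made for the illustration.

```python
import math

def line_integral_ds(u, x, y, dx, dy, t0, t1, n=2000):
    # Midpoint-rule approximation of the integral of u ds along the curve
    # x = x(t), y = y(t), with ds = sqrt(x'(t)^2 + y'(t)^2) dt
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        total += u(x(t), y(t)) * math.hypot(dx(t), dy(t))
    return total * h

r, R = 1.0, 3.0
x = lambda t: r * math.cos(t)
y = lambda t: R + r * math.sin(t)
dx = lambda t: -r * math.sin(t)
dy = lambda t: r * math.cos(t)

length = line_integral_ds(lambda px, py: 1.0, x, y, dx, dy, 0.0, 2.0 * math.pi)
moment = line_integral_ds(lambda px, py: py, x, y, dx, dy, 0.0, 2.0 * math.pi)
y_gc = moment / length             # second formula (86)
surface = 2.0 * math.pi * moment   # S = 2*pi * integral of y ds
print(y_gc, surface, 4.0 * math.pi ** 2 * r * R)
```

The geometric centre of gravity of the circle comes out at distance R from the axis, and the surface area agrees with the product of the length of the circle by the path of that centre.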

23. Line Integrals of the Second Type. There are line integrals
taken with respect to coordinates, besides the integrals of the first
type. When forming such an integral (called a line integral of the
second type) we suppose that the curve (L) is directed, that is there
is an indication in what direction the curve is described. If we have
a non-closed curve we must indicate which of its ends is regarded
as its initial point and which as its terminal point. To define the
integrals we write, instead of formula (83), the formulas

$$\int_{(L)} u\,dx = \lim\sum_{k=1}^{n} f(\sigma_k)\,\Delta x_k, \qquad \int_{(L)} u\,dy = \lim\sum_{k=1}^{n} f(\sigma_k)\,\Delta y_k \quad\text{and}\quad \int_{(L)} u\,dz = \lim\sum_{k=1}^{n} f(\sigma_k)\,\Delta z_k \tag{87}$$

where $\Delta x_k$ is the increment of the abscissa x along the kth
elementary arc etc.

Integrals of type (87) are readily reduced to ordinary definite
integrals. For example, if the curve (L) is represented in a parametric
form then the values of u assumed at the points of (L) become
dependent on t, i.e. u becomes a function of t. Therefore

$$\int_{(L)} u\,dx = \int_\gamma^\delta u(t)\,\dot{x}(t)\,dt$$

where the values $t = \gamma$ and $t = \delta$ correspond to the ends of the
curve (L). Consequently, the basic properties of the definite integral
are extended to line integrals of the second type (properties 2-5
in Sec. 4). But here property 1 in Sec. 4 also holds. It can be
formulated as follows: if the direction of describing the curve (L) is
reversed then integrals of type (87) are multiplied by -1. In fact,
if the curve (L) is passed in the opposite direction then all $\Delta x$ (and
all dx) change their signs. The possibility of changing the sign
of dx also indicates that the properties which are connected with
integration of inequalities (see Sec. 5) no longer hold for the
integrals of the second type. For instance, an integral of the form (87)
of a positive function may not be positive, in contrast to the
integral of the first type.
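Both remarks, the change of sign under reversal and the possible negativity of the integral of a positive function, can be seen on a quarter of the unit circle with u = y. The sketch is illustrative only: the function name `line_integrals`, the midpoint rule and the chosen curve are assumptions, not taken from the text.

```python
import math

def line_integrals(u, x, y, dx, dy, t0, t1, n=2000):
    # Returns (integral of u ds, integral of u dx) along x = x(t), y = y(t)
    # traversed from t = t0 to t = t1; t1 < t0 reverses the direction
    h = (t1 - t0) / n
    first_type = second_type = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        value = u(x(t), y(t))
        first_type += value * math.hypot(dx(t), dy(t)) * abs(h)  # ds > 0 always
        second_type += value * dx(t) * h                          # dx keeps its sign
    return first_type, second_type

u = lambda px, py: py          # positive along the open quarter circle
x, y = math.cos, math.sin
dx = lambda t: -math.sin(t)
dy = math.cos

forward = line_integrals(u, x, y, dx, dy, 0.0, math.pi / 2)
backward = line_integrals(u, x, y, dx, dy, math.pi / 2, 0.0)
print(forward, backward)
```

The integral with respect to the arc length equals 1 for both directions, whereas the integral of y dx equals -π/4 in one direction and +π/4 in the other.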
In the theory of differential equations (Sec. XV.6) and in the
theory of vector fields (§ XVI.6) we often use the combinations of
integrals of type (87) of the form

$$\int_{(L)} (u\,dx + v\,dy + w\,dz) = \int_{(L)} u\,dx + \int_{(L)} v\,dy + \int_{(L)} w\,dz$$

where u, v and w are given functions of x, y and z.

Some examples considered in Sec. 8 were virtually integrals of
type (87). Indeed, formulas (29)-(31) can be rewritten as

$$S = -\int_{(L)} y\,dx = \int_{(L)} x\,dy = \frac{1}{2}\int_{(L)} (x\,dy - y\,dx) \tag{88}$$

on the basis of formulas $\dot{x}\,dt = dx$ and $\dot{y}\,dt = dy$.
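As a quick check of (88), the three expressions can be evaluated for an ellipse, whose area πab is known. The sketch assumes the counterclockwise parametrization x = a cos t, y = b sin t; the function name `closed_integrals` and the semi-axes 3 and 2 are hypothetical choices for the illustration.

```python
import math

def closed_integrals(x, y, dx, dy, n=2000):
    # Midpoint-rule approximations of the integrals of y dx and x dy
    # along a closed curve parametrized by t in [0, 2*pi]
    h = 2.0 * math.pi / n
    int_y_dx = int_x_dy = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        int_y_dx += y(t) * dx(t) * h
        int_x_dy += x(t) * dy(t) * h
    return int_y_dx, int_x_dy

a, b = 3.0, 2.0
int_y_dx, int_x_dy = closed_integrals(
    lambda t: a * math.cos(t), lambda t: b * math.sin(t),
    lambda t: -a * math.sin(t), lambda t: b * math.cos(t))

areas = (-int_y_dx, int_x_dy, 0.5 * (int_x_dy - int_y_dx))
print(areas, math.pi * a * b)
```

Describing the ellipse clockwise instead changes the sign of all three expressions, in agreement with the remark on the direction of the curve.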

For our further aims we shall need the integral $\int_{(L)} y\,dx$ taken along
a closed plane curve arbitrarily placed in space, as in Fig. 272.
To evaluate the integral we project the curve (L) on the xOy-plane,
obtaining a curve (L'), and verify that

$$\int_{(L)} y\,dx = \int_{(L')} y\,dx \tag{89}$$

Indeed, the points of the curves (L) and (L') which correspond to
each other differ only in the values of the coordinate z which does
not affect integrals (89).

By (88), the integral on the right-hand side of (89) is equal to

$$\int_{(L')} y\,dx = -S'$$

Now we shall use a well-known property of projections: if a plane
figure is projected on another plane then the area of the projection is
equal to the product of the area of the initial figure by the cosine of
the angle between the planes (see Fig. 273). Actually, in this case
the sizes in one direction are multiplied by $\cos\alpha$ whereas the sizes
in the perpendicular direction do not change.

Fig. 273

Thus, we have

$$\int_{(L)} y\,dx = -S\cos(\mathbf{n}, z) \tag{90}$$

where S is the area bounded by the curve (L) and $\mathbf{n}$ is the unit vector
in the direction of the outer normal to the plane of the curve (L).
The direction of the vector $\mathbf{n}$ corresponds to the direction of
describing (L) according to the right-hand screw rule (see Sec. VII.11).
The formula

$$\int_{(L)} x\,dy = S\cos(\mathbf{n}, z) \tag{91}$$

is proved in a similar way.

Performing circular permutation of the coordinate axes (see
Sec. VII.12) we deduce, from formulas (90) and (91), the formulas

$$\int_{(L)} z\,dy = -S\cos(\mathbf{n}, x), \qquad \int_{(L)} x\,dz = -S\cos(\mathbf{n}, y),$$

$$\int_{(L)} y\,dz = S\cos(\mathbf{n}, x) \quad\text{and}\quad \int_{(L)} z\,dx = S\cos(\mathbf{n}, y) \tag{92}$$

We note in conclusion that integrals of the forms

$$\oint f(x)\,dx, \qquad \oint \varphi(y)\,dy \quad\text{and}\quad \oint \psi(z)\,dz$$

taken along any closed curve (L) are always equal to zero (the
notation $\oint$ is used for designating integrals taken along closed contours).
Actually, let F(x) be an antiderivative of f(x). Then the first of the
above integrals is equal to the increment of the function F(x) gained
as the variable point describes (L) and returns to its original
position. The integral is therefore equal to zero. The second and the
third integrals are treated similarly.

24. Conditions for a Line Integral of the Second Type to Be
Independent of the Path of Integration. Consider an integral of the form

$$I = \int_{(L)} [P(x, y, z)\,dx + Q(x, y, z)\,dy + R(x, y, z)\,dz] \tag{93}$$

where P, Q and R are some functions defined over the whole space
with the coordinates x, y and z or in a domain lying in the space.
Let (L) be an arbitrary curve lying inside the domain. We shall
also suppose that P, Q and R are finite in the domain, i.e. they are
bounded and do not approach infinity at any point. There are such
physical problems in which we encounter integrals (93) that only
depend on the positions of the initial and terminal points of the
curve (L) but not on the form of (L). This means that these integrals
are independent of the way in which (L) connects the initial point
with the terminal point. In other words, in these cases we have
(Fig. 274)

$$\int_{(L')} (P\,dx + Q\,dy + R\,dz) = \int_{(L'')} (P\,dx + Q\,dy + R\,dz) = \int_{(L''')} (P\,dx + Q\,dy + R\,dz) = \dots \tag{94}$$

for any fixed points A and B and for all the possible curves L', L'',
L''', ... connecting A and B (where L', L'', L''', ... lie inside the
domain). We shall say that integral (93) is independent of the
path of integration. Such an integral can express, for instance,
the work of a field of force as a point is moving. Then condition (94)
means that the work only depends on the positions of the initial
and terminal points.

Fig. 274 Fig. 275

For integral (93) to be independent of the path of integration,
it is necessary and sufficient that integral (93) taken round any
closed contour should be equal to zero, that is

$$\oint_{(L)} (P\,dx + Q\,dy + R\,dz) = 0 \qquad (95)$$

for any closed contour (L).

To prove the assertion we shall first suppose that condition (95) 
is fulfilled and that we are given paths ( L' ) and (L") with the same 
initial and terminal points (see Fig. 274). Let us construct the closed 
contour (L) traced from A to B along ( L ' ) and from B to A along 
(L"), the path (L") being described in the direction which is opposite 
to its original direction in the corresponding integral (94). Now we 
take condition (95), break up integral (95) into two integrals taken 
along the two paths, according to property 3 in Sec. 4, and change 
the direction of integration for the second integral with the
simultaneous change of its sign, according to property 1 in Sec. 4.
This yields

$$0 = \oint_{(L)} = \int_{(L')} - \int_{(L'')}, \quad \text{i.e.} \quad \int_{(L')} (P\,dx + Q\,dy + R\,dz) = \int_{(L'')} (P\,dx + Q\,dy + R\,dz)$$

which implies (94). Reversing the preceding argument we can easily 
deduce (95) from (94). 

For integral (93) to be independent of the path of integration, it
is necessary and sufficient that the element of integration should
be the total differential of some single-valued function of three
variables, that is

$$P\,dx + Q\,dy + R\,dz = du \qquad (96)$$

where u = u(x, y, z) is a certain function. To prove the assertion
we first suppose that condition (96) holds. Then

$$\int_{(L)} (P\,dx + Q\,dy + R\,dz) = \int_{(L)} du = u(B) - u(A)$$

where A and B are, respectively, the initial and the terminal points
of the curve (L). Consequently, the integral is independent of the
path of integration.

Conversely, let integral (93) be independent of the path of
integration. We fix an arbitrary point M_0 in space and introduce the
following function of the variable point M(x, y, z):

$$u(M) = \int_{(M_0M)} (P\,dx + Q\,dy + R\,dz) \qquad (97)$$

where M_0M is an arbitrary curve connecting M_0 and M and directed
from M_0 to M. Condition (94) implies that the function is
single-valued, that is it assumes a certain uniquely defined value at
each point M. To find du we first give x an infinitesimal increment dx.
Then the point M passes to the position M' as in Fig. 275, and the
infinitesimal line segment MM' is parallel to the x-axis. The
corresponding increment of the function u is equal to

$$\Delta_x u = u(M') - u(M) = \int_{(M_0MM')} (P\,dx + Q\,dy + R\,dz) - \int_{(M_0M)} (P\,dx + Q\,dy + R\,dz) = \int_{(MM')} (P\,dx + Q\,dy + R\,dz)$$

But the coordinates y and z not varying along the segment MM',
we have dy = dz = 0 in the last integral. Therefore
$\Delta_x u = \int_{(MM')} P\,dx$.
The segment MM' being infinitesimal, we can consider P to be
constant on it (P = const) to within infinitesimals of higher
order of smallness. Therefore, passing from the increment to the
differential and thus dropping infinitesimals of higher order we
obtain d_x u = P dx. Similarly, we find that d_y u = Q dy and
d_z u = R dz. Adding up the results we get (see Sec. IX.11)

$$du = P\,dx + Q\,dy + R\,dz$$

and consequently condition (96) is fulfilled.

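If we recall expression (IX.7) of the total differential we can write
condition (96) in the equivalent form

$$\frac{\partial u}{\partial x} = P, \quad \frac{\partial u}{\partial y} = Q, \quad \frac{\partial u}{\partial z} = R \qquad (98)$$

It follows that if integral (93) is independent of the path of
integration then we have

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}, \quad \frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}, \quad \frac{\partial Q}{\partial z} = \frac{\partial R}{\partial y} \qquad (99)$$

Indeed, conditions (98) imply

$$\frac{\partial P}{\partial y} = \frac{\partial^2 u}{\partial x\,\partial y}, \quad \frac{\partial Q}{\partial x} = \frac{\partial^2 u}{\partial y\,\partial x}$$

and therefore, based on the independence of the mixed derivatives
of the order of differentiation (see Sec. IX.15), we obtain the first
equality (99). The other conditions (99) are proved similarly (let
the reader complete the proof!).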
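The independence of the path can also be observed numerically. In the
sketch below, which is only an illustration under our own assumptions,
the functions P, Q and R are obtained from the sample potential
u = x²y + yz³ (so that the element of integration is a total
differential), and the integral is computed along two different paths
joining the same points A(0, 0, 0) and B(1, 2, 1). The potential, the
two paths and the helper names are hypothetical choices made for the
example.

```python
def potential(x, y, z):
    """Sample single-valued function u(x, y, z) (an assumption of this sketch)."""
    return x * x * y + y * z ** 3

def field(x, y, z):
    """P, Q, R equal to the partial derivatives of the sample potential."""
    return 2 * x * y, x * x + z ** 3, 3 * y * z * z

def line_integral(path, steps=20000):
    """Midpoint-rule approximation of the integral of P dx + Q dy + R dz."""
    total = 0.0
    for k in range(steps):
        a = path(k / steps)
        b = path((k + 1) / steps)
        P, Q, R = field(*path((k + 0.5) / steps))
        total += P * (b[0] - a[0]) + Q * (b[1] - a[1]) + R * (b[2] - a[2])
    return total

straight = lambda t: (t, 2 * t, t)           # segment from A(0,0,0) to B(1,2,1)
curved = lambda t: (t, 2 * t * t, t ** 3)    # another curve with the same end points

print(line_integral(straight), line_integral(curved))
print(potential(1, 2, 1) - potential(0, 0, 0))   # u(B) - u(A)
```

Both integrals should be close to the difference u(B) − u(A) printed
on the last line.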

In the theory of vector field (Sec. XVI. 27) we shall prove the 
converse assertion: if conditions (99) are fulfilled and if the domain 
in which the functions P , Q and R are considered is simply-connected 
then integral (93) is independent of the path of integration. A domain 
is said to be simply-connected if any closed contour lying in it can 
be contracted to a point by means of a continuous deformation 
without falling outside the domain. 

The whole space, a half-space, a dihedral or a polyhedral angle, 
the interior or the exterior of a sphere, the interior of a finite or 
infinite circular cylinder are examples of simply-connected domains. 
In contrast to it, the exterior of an infinite circular cylinder is 
a doubly-connected domain. Indeed, if we consider the contour ( L ) 
depicted in Fig. 276 we see that it cannot be contracted to a point 
without falling outside the domain. Further examples of multiply- 
connected domains are the interior or the exterior of a torus and 
the whole space from which all the points belonging to a circle or 
to an infinite straight line are removed. In Fig. 277 we see
a plane simply-connected domain and a plane four-connected
domain.

Comparing conditions (96) and (99) we see that conditions (99)
are necessary and sufficient for the expression P dx + Q dy + R dz
to be a total differential of a single-valued function u(x, y, z)
defined in a simply-connected domain. It is possible to show that
if conditions (99) are fulfilled in a multiply-connected domain then
the function u(x, y, z) constructed in accordance with formula (97)
satisfies condition (96) but may not be single-valued in the general
case.

[Fig. 276] [Fig. 277]
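A standard illustration of the last remark, added here as a sketch and
not taken from the text, is the field P = −y/(x² + y²),
Q = x/(x² + y²), R = 0 considered in the space from which the z-axis
is removed. The equality of the cross derivatives holds at every point
of this doubly-connected domain, and nevertheless the integral round
a circle encircling the removed axis is not zero. The helper names
below are hypothetical.

```python
import math

def P(x, y):
    return -y / (x * x + y * y)

def Q(x, y):
    return x / (x * x + y * y)

def circulation(cx, cy, radius, steps=20000):
    """Integral of P dx + Q dy round a circle in the plane z = const (midpoint rule)."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t0, t1, tm = k * h, (k + 1) * h, (k + 0.5) * h
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        xm = cx + radius * math.cos(tm)
        ym = cy + radius * math.sin(tm)
        total += P(xm, ym) * dx + Q(xm, ym) * dy
    return total

def cross_derivatives_agree(x, y, eps=1e-5):
    """Central-difference check that dP/dy equals dQ/dx at the point (x, y)."""
    dP_dy = (P(x, y + eps) - P(x, y - eps)) / (2 * eps)
    dQ_dx = (Q(x + eps, y) - Q(x - eps, y)) / (2 * eps)
    return abs(dP_dy - dQ_dx) < 1e-6

print(cross_derivatives_agree(0.8, -1.3))
print(circulation(0.0, 0.0, 1.0), 2.0 * math.pi)   # contour encircling the removed axis
print(circulation(3.0, 0.0, 1.0))                  # contour that can be contracted to a point
```

The first contour cannot be contracted to a point inside the domain,
and the integral along it is close to 2π; for the second contour the
integral is close to zero.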

An integral of the form

$$\int_{(L)} [P(x, y)\,dx + Q(x, y)\,dy]$$

taken along a plane curve is considered similarly. Accordingly,
in all the expressions we must drop the terms containing z and dz.
In particular, conditions (99) turn into one condition

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$$

for such an integral.

§ 7. The Concept of Generalized Function 

25. Delta Function. The Dirac delta function which is widely 
used in mathematics and its applications is the simplest example 
of a generalized function. The theory of generalized functions was 
founded in 1936 by the Soviet mathematician S. L. Sobolev. It was 
thoroughly developed only during the last twenty years. 

To get an approximate representation of the delta function we
first consider the discontinuous function defined by the equalities

$$y = \delta_N(x) = \begin{cases} 0 & \text{for } -\infty < x < -\frac{1}{2N} \\ N & \text{for } -\frac{1}{2N} < x < \frac{1}{2N} \\ 0 & \text{for } \frac{1}{2N} < x < \infty \end{cases} \qquad (100)$$

where N is a very large positive number. The graph of the function
is shown in Fig. 278. As usual, the values of the function at the
points of discontinuity x = ±1/(2N) are of no importance and therefore
they are not indicated here. The delta function can be thought of
as the limit of function (100) as N → ∞. But, strictly speaking,
such a limit should be defined as

$$\delta(x) = \begin{cases} 0 & \text{for } -\infty < x < -0 \text{ and } +0 < x < \infty \\ \infty & \text{for } -0 < x < +0 \end{cases}$$

with the additional condition

$$\int_{-\infty}^{\infty} \delta(x)\,dx = 1$$

which is implied by the fact that $\int_{-\infty}^{\infty} \delta_N(x)\,dx = 1$
because the area shaded in Fig. 278 is equal to unity. We cannot
therefore speak about the graph of a delta function. The last
relation can also be rewritten in the form

$$\int_{-0}^{+0} \delta(x)\,dx = 1 \qquad (101)$$



An approximate representation of the delta function need not
necessarily be obtained by means of the discontinuous function (100).
For example, we can also take the function
$\delta_N(x) = \frac{N}{\pi}(1 + N^2x^2)^{-1}$ defined over the interval
−∞ < x < ∞ for this purpose (let the reader investigate the behaviour
of the graph of this function as N → ∞) and so on. Generally, we can
take every function whose values are “concentrated” near x = 0. More
precisely, for the delta function to be defined as the limit of the
function δ_N(x) (understood in the above sense) it is sufficient that
δ_N(x) should satisfy the conditions δ_N(x) ≥ 0 for −∞ < x < ∞,

$$\int_{-\infty}^{-a} \delta_N(x)\,dx \to 0, \quad \int_{b}^{\infty} \delta_N(x)\,dx \to 0 \quad \text{and} \quad \int_{-a}^{b} \delta_N(x)\,dx \to 1$$

as N → ∞ for any positive constant numbers a and b.

[Fig. 278]

If we consider masses distributed along the x-axis and their linear
densities then the linear density of a material point of unit mass
placed at the coordinate origin turns out to be equal to the delta
function. In fact, if we first consider this mass to be uniformly
distributed over the segment −1/(2N) ≤ x ≤ 1/(2N) and not concentrated
at the point x = 0 then its density will be of the form depicted in
Fig. 278 (why is it so?). Now if N → ∞ then the mass will be
contracted to the point in the limiting process and its density will
become the delta function.

Similarly, the function ρ(x) = m δ(x − a) is the linear density
of a mass m concentrated at the point x = a (we say “linear density”
because we regard the mass as distributed along a line, that is along
the x-axis in our case). The density of a point charge or of a point
force can be represented in a similar way and so on.

It is sometimes necessary to add together a delta function and
an ordinary function. For example, the sum

$$\rho(x) = \rho_0 + m_1\,\delta(x - a_1) + m_2\,\delta(x - a_2)$$

is the density of the combination of a uniformly distributed mass 
and two discrete mass points. Hence, the use of the delta function 
makes it possible to apply formulas which were originally deduced 
for continuously distributed masses to any combinations of discrete 
mass points and distributed masses. Moreover, from this point of 
view the distinction between discrete mass points and continuously 
distributed masses becomes inessential. The same refers to charges, 
forces and so on. 

When integrating an expression containing delta functions one
must apply formula (101). For instance, if f(x) is a continuous
function in the interval α < x < β and if α < a < β then

$$\int_{\alpha}^{\beta} f(x)\,\delta(x - a)\,dx = \int_{\alpha}^{a-0} f(x)\,\delta(x - a)\,dx + \int_{a-0}^{a+0} f(x)\,\delta(x - a)\,dx + \int_{a+0}^{\beta} f(x)\,\delta(x - a)\,dx = 0 + \int_{a-0}^{a+0} f(a)\,\delta(x - a)\,dx + 0 = f(a)\cdot 1 = f(a) \qquad (102)$$

because the first and the third integrals on the right-hand side are
equal to zero since the delta function is equal to zero on the
corresponding intervals whereas f(a) substitutes for f(x) in the second
integral since a continuous function can be regarded as being
constant on an infinitesimal interval. Therefore formula (101) yields
(102).

We must indicate the two-fold sense of an integral of the form
$\int_{a}^{\beta} f(x)\,\delta(x - a)\,dx$. We can speak of the value of
the integral only if there is a specification as to whether the
singularity of the delta function δ(x − a) (i.e. the point x = a) is
included into the range of integration or not. Consequently, we must
write either

$$\int_{a-0}^{\beta} f(x)\,\delta(x - a)\,dx = f(a) \quad \text{or} \quad \int_{a+0}^{\beta} f(x)\,\delta(x - a)\,dx = 0.$$
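The rule that integration against δ(x − a) picks out the value f(a)
can be observed on the approximate representations of the delta
function. The sketch below is an illustration under our own
assumptions: it uses the rectangular pulse of height N and width 1/N
and, as a second example, a smooth bell-shaped family of unit area;
the function names, the sample function cos x and the point a = 0.5
are hypothetical choices.

```python
import math

def rect_pulse(x, n):
    """Rectangular pulse: height n on an interval of width 1/n around zero."""
    return n if abs(x) < 0.5 / n else 0.0

def bell(x, n):
    """A smooth bell-shaped family with unit area (an illustrative choice)."""
    return n / (math.pi * (1.0 + (n * x) ** 2))

def sift(delta_n, n, f, a, lo, hi, steps=120000):
    """Midpoint-rule value of the integral of f(x) * delta_n(x - a) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        xm = lo + (k + 0.5) * h
        total += f(xm) * delta_n(xm - a, n)
    return total * h

a = 0.5
for n in (10, 100, 1000):
    print(n, sift(rect_pulse, n, math.cos, a, -1.0, 2.0),
          sift(bell, n, math.cos, a, -1.0, 2.0))
print(math.cos(a))   # the value f(a) that both columns approach
```

If the range of integration does not contain the point a, the same
computation with the rectangular pulse gives zero for all sufficiently
large N.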

Integrating the Dirac delta function from −∞ to x we obtain the
step function (the Heaviside unit function)

$$e(x) = \int_{-\infty}^{x} \delta(t)\,dt = \begin{cases} 0 & \text{for } -\infty < x < 0 \\ 1 & \text{for } 0 < x < \infty \end{cases} \qquad (103)$$

which also has many applications. Its graph is shown in Fig. 279. 
Such a function can be applied to describing a process of an instan- 
taneous application of a constant action, for instance, the process 
which occurs when a constant voltage is instantaneously applied 
to the terminals of an electric circuit. 

Thus, the integration of the delta function results in an ordinary 
though discontinuous function. The repeated integration yields 
a continuous function (let the reader investigate the form of the 
function). 
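The two successive integrations can be traced numerically on the
approximate representation of the delta function. In the sketch below,
given only as an illustration, the rectangular pulse of height N and
width 1/N is integrated once, which yields an approximation of the
step function, and then once more, which yields an approximation of
the continuous “ramp” equal to 0 for x < 0 and to x for x > 0. The
interval, the value N = 50 and the helper names are our own
assumptions.

```python
def rect_pulse(x, n):
    """Rectangular pulse: height n on an interval of width 1/n around zero."""
    return n if abs(x) < 0.5 / n else 0.0

def step_and_ramp(n, lo=-1.0, hi=1.0, steps=40000):
    """Return grid points, the first and the second running integral of the pulse."""
    h = (hi - lo) / steps
    xs = [lo + k * h for k in range(steps + 1)]
    step = [0.0]
    for k in range(steps):            # first integration (midpoint rule)
        step.append(step[-1] + rect_pulse(lo + (k + 0.5) * h, n) * h)
    ramp = [0.0]
    for k in range(steps):            # second integration (trapezoid rule)
        ramp.append(ramp[-1] + 0.5 * (step[k] + step[k + 1]) * h)
    return xs, step, ramp

xs, step, ramp = step_and_ramp(50)
for i in (10000, 20000, 30000, 40000):
    print(xs[i], step[i], ramp[i])
```

The second column of the output approximates the discontinuous
function e(x), the third one the continuous function obtained by the
repeated integration.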

If we differentiate equality (103) we obtain

$$\delta(x) = e'(x) \qquad (104)$$

This equality should be understood in the generalized sense. For
instance, we can first substitute the oblique segment connecting the
points (−1/(2N), 0) and (1/(2N), 1) for the vertical segment in
Fig. 279 (this inclined segment is represented by the dotted line).
Then the discontinuous function is replaced by a continuous function
whose derivative has the graph of the form shown in Fig. 278. Now
passing to the limit, as N → ∞, we obtain relation (104).

Hence, we can say that a delta function is obtained by
differentiating a discontinuous function having a finite jump. For
example, let us take the law of motion considered in Sec. I.13 which is
connected with Fig. 6. The corresponding velocity is expressed by
the formula

$$s'_t = \begin{cases} gt & \text{for } 0 \le t < t^* \\ v & \text{for } t^* < t < \infty \end{cases}$$

and it has the jump s'_t(t* + 0) − s'_t(t* − 0) = v − gt* at the
point t = t*. The acceleration is therefore equal to
(v − gt*) δ(t − t*) + g e(t* − t) (check it up!). The first term in
the sum describes the impact phenomenon.

26. Application to Constructing Influence Function. Constructing 
an influence function (which is also called Green’s function after 
the English mathematician G. Green, 1793-1841) is one of the impor- 
tant applications of the delta function. We begin with an example. 
Let us consider the deflection h(x) of a beam subjected to a
transversal external load of intensity p(x) (see Fig. 280 where we see
the graph or, as it is called, the diagram of the external load). We
shall suppose that the load is not very large and we can therefore
apply the linear law of elasticity which implies that if we combine
external forces the corresponding deflections add up.

[Fig. 279]

Let us imagine that we have unit force applied at a point ξ [whose
“intensity” is δ(x − ξ)]. Then the beam will be deformed in a certain
way. We denote the deflection at a point x under the action of unit
force applied at the point ξ by y = G(x, ξ). It is the function
G(x, ξ) that is called the influence function of our problem. We
shall show that if the function is known it is easy to determine
the deflection under the action of an arbitrary load of intensity
p(x).

Indeed, let us consider the portion of the load on the infinitesimal
segment of the axis from the point ξ to the point ξ + dξ. This load is
equal to p(ξ) dξ. Therefore the deflection under this load at a point
x is equal to G(x, ξ) p(ξ) dξ because the linearity we have
mentioned above implies that if a load is multiplied by a constant the
corresponding deflection is multiplied by the same constant. Adding
together all these infinitesimal deflections we obtain the resultant
deflection (see Fig. 280):

$$h(x) = \int_{0}^{l} G(x, \xi)\,p(\xi)\,d\xi \qquad (105)$$

Now we proceed to describe the general scheme for constructing
influence functions. Let an external action be applied to an object
and let it be described by a function f(x) defined over an interval
a ≤ x ≤ b (the role of f(x) was played by the function p(x) in
our previous example). Let a function $\tilde f(x)$, a ≤ x ≤ b, describe
the result of the action (such a result was described by the function
h(x) in the example). Thus, every given function f is transformed
into a new function $\tilde f$. By Sec. XI.6, such a law of
transformation of a preimage which is the function f into the image
which is the function $\tilde f$ is called an operator. For instance,
the operator of differentiation D transforms functions according to the
law Df = f' and thus we have D(sin x) = cos x, D(x³) = 3x² etc. Here
sin x is a preimage which is transformed by the operator D into the
image cos x etc. The concept of an operator is analogous to the concept
of a function (see Sec. I.11) but a function transforms numbers into
numbers (i.e. the values of an argument into the corresponding values
of the function) whereas an operator transforms functions into
functions (or, generally, objects of any kind into objects of the
same kind or of another kind).

Let us denote the operator which transforms a function f(x)
describing an external action into the “response” function
$\tilde f(x)$ by A, that is $Af = \tilde f$. We shall suppose that there
is a linearity law here or, as we say, the superposition principle:
when external actions are added together their results are also added
up. This law which can be written in the form

$$A(f_1 + f_2) = Af_1 + Af_2$$

is often applied when the external actions are not very large. An
operator possessing this property is said to be a linear operator
(compare with Sec. XI.6). (Let the reader verify that the operator D
is linear.) From the law of linearity we can deduce that if an external
action is multiplied by a constant the corresponding result is also
multiplied by the constant, that is

$$A(Cf) = C\,Af$$

where C = const. (Try to justify this rule first by taking positive
integral C and then by putting C = 1/2, 1/3, ..., C = m/n where
m and n are integers, C = 0 and, finally, by passing to negative C.)

Now denote as G(x, ξ) the result of an external action
described by the delta function δ(x − ξ) (regarded as a function of x
for every fixed value of ξ), that is

$$A[\delta(x - \xi)] = G(x, \xi)$$

Any function f(x) can be represented as a sum of “impulse functions”
of the form depicted in Fig. 281 below (the corresponding summation
is depicted in Fig. 281 above). Each of these functions has a
singularity only at one point (when we pass to the limit as dξ → 0) and
is therefore equal to f(ξ) dξ δ(x − ξ) (why is it so?). Thus we have

$$f(x) = \sum f(\xi)\,d\xi\,\delta(x - \xi)$$

This is in fact formula (102) written in another form. It follows that

$$A[f(x)] = A\Big[\sum f(\xi)\,d\xi\,\delta(x - \xi)\Big] = \sum A[f(\xi)\,d\xi\,\delta(x - \xi)] = \sum f(\xi)\,d\xi\,A[\delta(x - \xi)] = \sum f(\xi)\,d\xi\,G(x, \xi)$$

But when dξ is infinitesimal this sum turns into an integral, and
finally we obtain

$$A[f(x)] = \int_{a}^{b} G(x, \xi)\,f(\xi)\,d\xi \qquad (106)$$

[compare with formula (105)].
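The scheme can be imitated on a grid. In the sketch below, which is
only an illustration under our own assumptions, a hypothetical linear
system (the running integral of the input) plays the role of the
operator A. Its influence function is first “measured” by applying
unit actions concentrated in single grid cells, and then the response
to an arbitrary action is recovered by the discrete analogue of the
superposition integral. All names and the sample load are hypothetical.

```python
import math

def system(load, h):
    """A hypothetical linear system: the running integral of the input samples."""
    out, acc = [], 0.0
    for value in load:
        acc += value * h
        out.append(acc)
    return out

def influence_table(n, h):
    """Row j holds the response to a unit action concentrated in cell j."""
    table = []
    for j in range(n):
        impulse = [0.0] * n
        impulse[j] = 1.0 / h          # discrete stand-in for delta(x - xi_j)
        table.append(system(impulse, h))
    return table

def response_by_superposition(table, load, h):
    """Discrete analogue of the integral of G(x, xi) f(xi) d(xi)."""
    n = len(load)
    return [sum(table[j][i] * load[j] * h for j in range(n)) for i in range(n)]

n, h = 200, 1.0 / 200
load = [math.sin(3.0 * (j + 0.5) * h) for j in range(n)]
G = influence_table(n, h)
direct = system(load, h)
via_G = response_by_superposition(G, load, h)
print(max(abs(u - v) for u, v in zip(direct, via_G)))   # tiny rounding-level difference
```

For this particular system the measured influence function is a step:
the response at x_i to a unit action at ξ_j vanishes for x_i < ξ_j and
is equal to unity otherwise, in agreement with the fact that the
integral of the delta function is the step function.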

An influence function can be determined theoretically in simpler
cases (for instance, see the end of Sec. XV.16). In more complicated
cases it can be found experimentally by performing necessary
measurements (for example, we can measure the deformation of
an elastic system caused by the action of unit force). The essential
thing is to verify the linearity of the system in question,
that is to find out whether the superposition principle is
applicable. The applicability can be deduced theoretically or
confirmed by an experiment. Of course, not all systems are linear or
even approximately linear. There are systems that are essentially
non-linear, and linear methods are inapplicable to such systems.

It should be noted that the functions f and $\tilde f$ can be defined
over different intervals. Moreover, the independent variables x
and ξ entering into formula (106) may have different physical
meanings. The independent variable ξ is sometimes interpreted as time.
In this case the influence function describes the result of applying
unit impulse at the moment ξ.




[Fig. 281]

It sometimes happens that a system in question is linear only
for infinitesimal external actions. But an action whose density is
described by a delta function cannot be considered to be small.
Then the influence function can be defined by the formula

$$G(x, \xi) = \lim_{P \to 0} \frac{1}{P} A[P\,\delta(x - \xi)]$$

For instance, in our previous example we can find the deflection
corresponding to a small force P and then divide the result by P.
In such circumstances formula (106) is applicable only to the case
of small (or, more precisely, infinitesimal) external actions f(x).
27. Other Generalized Functions. We now take the approximate
representation δ_N(x) of the delta function given in Sec. 25 for
a continuous model and differentiate it. We obtain an approximate
representation of the derivative δ'(x). It can be seen that δ'(x)
takes on the values of both signs and has singularities that are
“more acute” than those of the delta function δ(x).

We have shown that the δ-function describes the density of
a unit charge located at the origin (see Sec. 25). Its derivative
δ'(x) describes the density of a dipole placed at the same point.
In fact, we can obtain such a dipole if we put the charges −q and q
at the points x = 0 and x = l, respectively, and then pass to the
limit, as l tends to zero, retaining the constant value of the quantity
p = ql (the dipole moment). Thus we obtain two infinitely large
charges of opposite signs with an infinitesimal distance between
them. Before the passage to the limit the density of the charges has
the form

$$q\,\delta(x - l) - q\,\delta(x) = -p\,\frac{\delta(x) - \delta(x - l)}{l}$$

and therefore, after the passage to the limit for l → 0, the density
becomes equal to −p δ'(x).

Integrals involving δ'(x) are evaluated with the help of
integration by parts. For instance, if f(x) has a continuous derivative
and α < a < β then

$$\int_{\alpha}^{\beta} f(x)\,\delta'(x - a)\,dx = f(x)\,\delta(x - a)\Big|_{x=\alpha}^{x=\beta} - \int_{\alpha}^{\beta} \delta(x - a)\,f'(x)\,dx = -f'(a)$$

Here we also give an example of incorrect calculations:

$$\int_{\alpha}^{\beta} f(x)\,\delta'(x - a)\,dx = \int_{a-0}^{a+0} f(x)\,\delta'(x - a)\,dx = f(a)\int_{a-0}^{a+0} \delta'(x - a)\,dx = f(a)\,\delta(x - a)\Big|_{x=a-0}^{x=a+0} = 0$$

This is wrong because δ'(x − a) has a very “acute” singularity and
therefore the substitution of f(a) for f(x) results in an approximation
which is not sufficiently accurate. The correct evaluation of the
integral must be performed as above.
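The difference between the correct and the incorrect evaluation can be
seen numerically. The sketch below is an illustration under our own
assumptions: the delta function is replaced by a narrow Gaussian bell
of unit area (a choice made here for convenience, not taken from the
text), its derivative is written out explicitly, and both calculations
are repeated for f(x) = sin x and a = 0.3. All names are hypothetical.

```python
import math

def gauss_delta(x, n):
    """Narrow Gaussian bell of unit area used as a stand-in for the delta function."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def gauss_delta_prime(x, n):
    """Derivative of gauss_delta with respect to x."""
    return -2.0 * n * n * x * gauss_delta(x, n)

def integrate(g, lo, hi, steps=100000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

f, a, n = math.sin, 0.3, 50

# Correct evaluation: keep f(x) under the integral sign.
correct = integrate(lambda x: f(x) * gauss_delta_prime(x - a, n), a - 1.0, a + 1.0)

# Incorrect evaluation: f(x) replaced by the constant f(a).
wrong = f(a) * integrate(lambda x: gauss_delta_prime(x - a, n), a - 1.0, a + 1.0)

print(correct, -math.cos(a))   # close to -f'(a)
print(wrong)                   # close to zero
```

The first line shows that the integral tends to −f'(a), whereas the
replacement of f(x) by f(a) loses exactly the term which determines
the answer.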

Generalized functions can be classified according to the number
of successive integrations which must be performed in order to
obtain a continuous function. If we assume this classification then
continuous functions can be regarded as generalized functions of
zero order; functions with finite jump discontinuities and ordinary
functions with integrable singularities placed at finite distances
(for example, from the origin) described in Sec. 16 are generalized
functions of the first order. The function δ(x) is the simplest example
of a generalized function of the second order (see Sec. 25) whereas
δ'(x) is of the third order and so on. The differentiation of a
function increases the order by unity and the integration reduces it
by unity.

When a function with non-integrable singularities is interpreted
as a generalized function we must indicate the function of order 0
or 1 from which the function in question can be obtained by
differentiation because there can be many such functions. For instance,
the function 1/x having a non-integrable singularity at x = 0 can be
regarded, for x ≠ 0, as being equal to

$$(\ln|x|)' \quad \text{or to} \quad (\ln|x| + e(x))' \qquad (107)$$

where e(x) is the step function (see Sec. 25) because we have
e'(x) = 0 for x ≠ 0. But functions (107) differ by e'(x) = δ(x) and
their properties are different. In the theory of generalized functions
it is preferable to use form (107) for designating the function 1/x
because after one of these formulas has been chosen (or some other
formula of this kind) the function 1/x is represented in a unique
manner as a generalized function [of course, formulas (107) yield
different representations]. Similarly, the function 1/x² can be
regarded as −(ln|x|)'' and so on. Functions whose rate of growth at
their points of discontinuity is greater than the rate of growth of any
power function (for instance, the function $e^{1/|x|}$) should be
excluded from this classification. By the way, these functions are not
of great importance for applications. It should also be noted that the
growth of a function for x → ±∞ is not restricted here.

It turns out that the use of generalized functions makes it possible 
to extend most of the rules connected with differentiation of diffe- 
rent formulas without usual restrictions imposed on the behaviour 
of the functions. For instance, the Leibniz formula for differenti- 
ation of an integral with respect to a parameter (see Sec. 20) becomes 
true for any type of convergence of improper integrals in question 
and the like.

Differential Equations

A differential equation is an equation connecting two or more
functionally dependent variables and their differentials or, which is
the same, their derivatives. The problem of forming and solving these
equations is widely encountered in physics and engineering. The
process of solving a differential equation is called integration of the
differential equation.

§ 1. General Notions 

1. Examples. We have already dealt with some simple differential
equations in our course. For instance, take equation (XIV.22).

If the force F (s) varies according to Newton’s law ^i.e. F ; 
see Sec. XIV.14J we can rewrite the equation in the form 

01 •£--*- m 

where the work A — A (s) is an unknown function of the displace- 
ment s. Equation (1) is a differential equation, and the unknown 
function A (s) is found by means of integration. 

Another example is equation (XIV.27) which can be rewritten as

%=-frVWh < 2 , 

where h = h ( t ) is an unknown function. 

In addition, let us consider an example of elastic vibrations of 
a material point of mass M about an equilibrium position (see 
Fig. 282). Here the unknown function y = y (t) expresses the law 
of vibrations. For simplicity’s sake, let us suppose that there is 
a linear law of elasticity here, that is the elastic force is directly 
proportional to the deviation of the point from the equilibrium 
position. Then the force is equal to 

$$F = -ky$$

where k is the stiffness factor. If there are no other forces then,
according to Newton’s second law, we have

$$M\frac{d^2y}{dt^2} = F = -ky \qquad (3)$$

Thus, the differential equation of the law of vibrations is of the form

$$M\frac{d^2y}{dt^2} + ky = 0 \qquad (4)$$

where y = y(t) is the unknown function.

A differential equation of a problem in physics or engineering
is always deduced on the basis of a certain law describing a
relationship between infinitesimal variations of quantities in question
(a differential law). After the differential equation has been
integrated we get an integral law describing finite variations of the
quantities. The deduction of basic differential equations in a certain
branch of science is a very important operation because it essentially
determines the course of further development of this branch.

[Fig. 282]


2. Basic Definitions. A differential equation is usually taken 
in a form which connects an argument (or several arguments) and 
an unknown function (or several unknown functions) with its deri- 
vatives. Even if we originally have a relationship between diffe- 
rentials it is possible to transform it into a relationship between the 
derivatives [see formulas (1)]. If the unknown function in a diffe- 
rential equation depends on one variable the differential equation 
is called ordinary (for example, the equations in Sec. 1). If otherwise 
the equation is called a partial differential equation (think why it 
is called so). In this chapter we shall deal only with ordinary diffe- 
rential equations. 

The highest order of the derivative of the unknown function
entering into an equation is called the order of the differential
equation. Thus, equations (1) and (2) are first-order equations
whereas equation (4) is a second-order equation. The general form of
a differential equation of the nth order is

$$F(x, y, y', y'', \dots, y^{(n)}) = 0 \qquad (5)$$

where y = y(x) is the sought-for function. In particular cases
a function F may not depend on some of the quantities entering
into (5). For example, equation (4) does not contain the independent
variable and the derivative of the first order.

A function is called a solution of a differential equation if it
reduces the equation to an identity when substituted into the equation.
Even the simplest examples indicate that a differential equation
has infinitely many solutions. For instance, taking a simple equation
of the form

$$y' = x^2, \quad y = y(x) \qquad (6)$$

we immediately find, by integrating, that

$$y = \frac{x^3}{3} + C \qquad (7)$$

This is the general solution of equation (6). It contains an
arbitrary constant C and is the set of solutions containing all
solutions of the equation. Making the arbitrary constant assume
concrete numerical values we obtain particular solutions of
equation (6):

$$y = \frac{x^3}{3} + 6, \quad y = \frac{x^3}{3} - \sqrt{2} \quad \text{etc.}$$

If we take an nth-order equation of the form

$$y^{(n)} = x^2, \quad y = y(x)$$

then its general solution can be found by means of n successive
integrations and therefore it contains n arbitrary constants.
Similarly, the general solution of an equation of the form (5) also
contains n arbitrary constants, i.e. it has the form

$$y = y(x, C_1, C_2, \dots, C_n) \qquad (8)$$

We often obtain the general solution in an implicit form

$$\Phi(x, y, C_1, C_2, \dots, C_n) = 0 \qquad (9)$$

Relations (8) and (9) are also called general integrals of equation (5).
A particular solution can be obtained from (8) or (9) if we make each
of the arbitrary constants C_1, C_2, ..., C_n take on a certain
concrete numerical value. The graph of every particular solution is
called an integral curve of the differential equation in question.
Substituting these concrete numbers for the constants C_1, C_2, ...,
C_n into equation (8) or (9) we get an equation of the integral
curve.

To isolate a unique particular solution from the general solution 
we must set some additional conditions. Such conditions are often 
taken as so-called initial conditions. If we have a process developing 
in time then such conditions are a mathematical expression of the 
initial state of the process. 

For example, take the process of vibrations considered in Sec. 1.
The physical meaning of the problem makes it clear that a particular
(concrete) vibration will be completely specified if we set the values
of the initial deviation of the point from the equilibrium position
and of the initial velocity. Therefore the initial conditions for
equation (3) are of the form

$$y = y_0 \quad\text{and}\quad \frac{dy}{dt} = v_0 \quad\text{for}\quad t = t_0 \tag{10}$$

where $y_0$ and $v_0$ are the given values. In the general case of an
equation of form (5) the initial conditions are

$$y = y_0,\quad y' = (y')_0,\quad \ldots,\quad y^{(n-1)} = (y^{(n-1)})_0 \quad\text{for}\quad x = x_0 \tag{11}$$

where the values $y_0, (y')_0, \ldots, (y^{(n-1)})_0$ are given. Since the
general solution [for instance, of form (9)] contains $n$ arbitrary
constants, it is possible (at least theoretically) to determine the
values of the constants by taking advantage of the $n$ relations we have
set. Thus, generally speaking, the number of additional conditions (11)
is sufficient for determining the arbitrary constants and specifying
a particular solution. It appears natural, from the physical point
of view, that if a differential law controlling the development of
a process and an initial state of the process are given then the
process itself is completely specified.

Condition (11) for an equation of the first order of form (6) means
that for a certain value $x = x_0$ we must assign a value $y = y_0$.
For instance, let it be necessary to isolate a solution for which
$y(1) = 2$. Then (7) implies $2 = \frac{1}{3} + C$, i.e. $C = \frac{5}{3}$.
Hence, the sought-for particular solution has the form

$$y = \frac{x^3 + 5}{3}$$

The problem of finding a particular solution of a differential
equation when certain initial conditions are given is called the
Cauchy problem (initial-value problem).

As we shall see in Sec. 7, there are some differential equations
which possess so-called singular solutions in addition to the
particular ones contained in the general solution, that is solutions
that do not enter into the general solution.

§ 2. First-Order Differential Equations 

3. Geometric Meaning. The general form of a first-order differential
equation can be written as

$$F(x, y, y') = 0 \tag{12}$$

where $y = y(x)$ is the unknown function. For simplicity's sake,
we first suppose that the equation is solved for the derivative of the
unknown function. Then the equation takes the form

$$y' = f(x, y) \tag{13}$$

According to Sec. 2, the initial condition for equation (12) must
be written as

$$y = y_0 \ \text{is given for}\ x = x_0 \tag{14}$$

To get a geometric interpretation of equation (13) let us introduce
a plane with Cartesian coordinates $x$ and $y$. Then every particular
solution is represented by a curve (the integral curve) lying in the
x, y-plane, these curves being as yet unknown. But if we take an
arbitrary point $M(x, y)$ in the plane we can compute the value
of $f(x, y)$ which, as prescribed by equation (13), must equal
the slope of the tangent (see Sec. IV.3) to the desired curve at the
point $M(x, y)$ provided the curve passes through the point $M$.

We can therefore perform the following procedure: let us draw
(mentally) a small line segment with the slope $\tan\alpha = f(x, y)$
at each point $M(x, y)$ (see Fig. 283). Practically we can draw only
a number of such segments but theoretically we can regard the
segments as drawn through all the points. Thus we obtain the so-called
direction field in the plane defined by equation (13) (the general
concept of a field was introduced in Sec. IX.9). Hence, we see
that the integral curves of equation (13) must pass through the
points $M(x, y)$ of the x, y-plane in such a way that each curve
touches the segment at each point it passes through.

Thus, equation (13) defines a direction field in the x, y-plane.
On the other hand, initial condition (14) defines a point $M_0(x_0, y_0)$
through which the desired integral curve should pass. From the
geometric point of view it is clear that the above condition
completely specifies the integral curve (see Fig. 284). In other words,
initial condition (14) being given, equation (13) has a completely
specified unique solution. A more detailed investigation carried
out by Cauchy shows that the above assertion holds provided the
function $f$ is continuous at the point $M_0$ and has a finite value of
its derivative $\frac{\partial f}{\partial y}$ at the point (Cauchy's
conditions are in fact sufficient for the existence and uniqueness of
the solution but they are not necessary; certain cases when the
conditions of Cauchy's theorem are not fulfilled will be considered in
Sec. 7; as will be seen, this may result in the loss of uniqueness of
a solution).

These considerations concerning a direction field can be illustrated
by means of the well-known experiment with iron filings placed in
a magnetic field. The arrangement of the filings demonstrates a
direction field whose integral curves are the so-called magnetic lines
of force.

The geometric meaning of equation (13) we have just discussed
enables us to construct (approximately) the integral curves of the
equation. To do this we depict the directions of the field at as many
points as possible and then draw the curves according to the directions.

Practically, in constructing a direction field it is more convenient
to take the points belonging to the so-called isoclines instead of
choosing the points arbitrarily. An isocline is a locus of points
at which the field has the same direction. We can derive an equation
of an isocline if we equate the right-hand side of equation (13) to
a constant. Thus we write

$$f(x, y) = k$$

where $k$ is the slope of the field corresponding to the isocline we
have taken.

Fig. 285. PR is the direction of the field

For instance, let us take the equation $y' = x + y$.
Equating the right-hand side to the constants $-2$, $-1$, $-\frac{1}{2}$,
$0$, $1$ and $2$ we obtain the corresponding isoclines which, in this
case, are straight lines (i.e. the lines $x + y = -2$ etc.). These
isoclines are depicted in Fig. 285. The direction of the field is
indicated on each of the isoclines. To find the direction of the field
on a given isocline corresponding to the slope $k$ of the field we can
construct the right triangle $PQR$ having the base $PQ = 1$ parallel
to the x-axis and the altitude $QR = k$. Then the side $PR$ will
indicate the desired direction. There are also several integral curves
in Fig. 285 which are drawn in accordance with the directions. We
see that the straight line $x + y = -1$ is one of these curves. We
also see that the locus of the lowest points of the integral curves
is the straight line $x + y = 0$. In the case of an equation of general
form (13), in order to find the loci of the highest and of the lowest
points belonging to the integral curves it is necessary to construct
the isocline $f(x, y) = 0$. (Think how we can find the locus of the
points of inflection of integral curves in the general case.)
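The statements about the example $y' = x + y$ can be checked numerically. The following is only a minimal sketch: the helper names `isocline_y` and `rk4`, the sample abscissas and the step count are illustrative assumptions, and the Runge-Kutta scheme is a standard numerical method that is not discussed in the text.

```python
def f(x, y):
    """Right-hand side of the example equation y' = x + y."""
    return x + y


def isocline_y(k, x):
    """Ordinate of the point of the isocline x + y = k with abscissa x."""
    return k - x


def rk4(rhs, x0, y0, x_end, steps=1000):
    """Integrate y' = rhs(x, y) from x0 to x_end (classical Runge-Kutta)."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + h / 2, y + h * k1 / 2)
        k3 = rhs(x + h / 2, y + h * k2 / 2)
        k4 = rhs(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y


# The field slope equals k at every point of the isocline x + y = k.
slopes = [f(x, isocline_y(-2.0, x)) for x in (-1.0, 0.0, 3.0)]

# A curve started on the line x + y = -1 should stay on that line.
y_end = rk4(f, 0.0, -1.0, 2.0)
print(slopes, y_end)
```

The three slopes all equal the constant of the chosen isocline, and the curve started at the point $(0, -1)$ arrives at $x = 2$ with $x + y = -1$ again, which agrees with the remark that this straight line is itself an integral curve.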

4. Integrable Types of Equations. We say that a differential equ- 
ation is integrable by quadratures if its general solution is expres- 
sible in an explicit or implicit form which may contain quadratures 
(i.e. indefinite integrals) of some known functions. We consider the 
integration completed even if these quadratures are not in fact 
computed (the theory of the integral given in Chapters XIII and 
XIV deals with the methods of integrating functions). As we know, 
a quadrature may not be expressible in terms of elementary func- 
tions but nevertheless in this case we shall as well consider the 
integration of a differential equation completed. Unfortunately, 
even most of the simplest equations are not integrable by quadra- 
tures and it is therefore necessary to investigate them by means 
of some other methods which will be discussed later. But there exist 
certain classes of differential equations that are integrable by quad- 
ratures and we shall study them here. 

1. Differential equations with variables separable have already been
dealt with (see Sec. XIV.7). They have the general form

$$\frac{dy}{dx} = f(x)\,\varphi(y) \tag{15}$$

and their general solution is expressed by the formula

$$\int \frac{dy}{\varphi(y)} = \int f(x)\,dx + C \tag{16}$$

We have written the arbitrary constant $C$ (which we usually regard
as being included in the sign of the indefinite integral) in order to
stress that the constant enters into the general solution. Thus, we
see that equation (15) has been integrated by quadratures [this
is expressed by formula (16)].

As an example, take a simple equation of the form

$$\frac{dy}{dx} = 2xy$$

Separating the variables and performing integration we obtain

$$\frac{dy}{y} = 2x\,dx,\quad \ln|y| = x^2 + C,\quad |y| = e^{x^2 + C},\quad y = \pm e^{C} e^{x^2} \tag{17}$$

The answer can be written in a different form if we notice that
the expression $\pm e^{C}$ can also be regarded as an arbitrary constant.
Therefore $y = Ce^{x^2}$. It is apparent that the symbol $C$ in the last
formula designates a new constant which is different from the one
entering into formula (17).

To avoid the change of notation we could write

$$\ln|y| = x^2 + \ln C$$

while integrating in (17), since $\ln C$ is also an arbitrary
constant. Then, taking exponentials, we get

$$|y| = Ce^{x^2},\quad y = \pm Ce^{x^2} \quad\text{or, simply,}\quad y = Ce^{x^2}$$

because the signs $+$ and $-$ may also be included in $C$, that is $C$
in the expression $y = Ce^{x^2}$ is allowed to take on values of arbitrary
sign. In what follows we are going to perform transformations of this
kind without special mention.

We can similarly construct the general solution of the equation

$$P(x)\,Q(y)\,dx + R(x)\,S(y)\,dy = 0$$

which is also an equation with variables separable.
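A quick numerical check of the result $y = Ce^{x^2}$ for the equation $dy/dx = 2xy$ may be helpful. This is a sketch only: the central-difference approximation of the derivative, the step `h` and the sample values of `C` and `x` are assumptions chosen for illustration.

```python
import math


def y(x, C):
    """Candidate general solution y = C * exp(x^2)."""
    return C * math.exp(x * x)


def residual(x, C, h=1e-6):
    """Value of dy/dx - 2*x*y, with dy/dx taken from a central difference."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx - 2 * x * y(x, C)


# Constants of both signs (and zero) are tried, since C may have any sign.
worst = max(
    abs(residual(x, C))
    for C in (-3.0, 0.0, 0.5, 2.0)
    for x in (-1.0, 0.0, 0.7, 1.5)
)
print(worst)
```

The residual stays at the level of the discretization error for every tried constant, including negative values of $C$ and the value $C = 0$.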

2. Equations homogeneous in the argument and in the unknown
function. An equation of form (12) is said to be homogeneous in $x$
and $y$ if its left-hand side is a homogeneous function in $x$ and $y$,
that is if

$$F(tx, ty, y') = t^{k} F(x, y, y')$$

(the general definition of a homogeneous function was given in
Sec. IX.12).

Then, putting $t = \frac{1}{x}$, equation (12) can be rewritten in the form

$$F\left(1, \frac{y}{x}, y'\right) = 0$$

Solving the last equation for $y'$ we obtain

$$y' = \varphi\left(\frac{y}{x}\right) \tag{18}$$

This equation is easily integrated by means of the substitution

$$\frac{y}{x} = u,\quad y = ux,\quad y' = u'x + u$$

where $u = u(x)$ is a new unknown function which replaces $y$.
Substituting $y = ux$ into (18) we derive

$$u'x + u = \varphi(u),\quad \frac{du}{dx}\,x = \varphi(u) - u,\quad \frac{du}{\varphi(u) - u} = \frac{dx}{x}$$

and thus the variables have been separated. To complete the
integration we must integrate the last equation and then return from $u$
to the original unknown function $y$.
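As an illustration of the substitution, here is a short worked example. The particular equation $y' = 1 + \frac{y}{x}$ is chosen here only for illustration and does not come from the text; the last line verifies the answer by direct differentiation.

```latex
\begin{aligned}
y' &= 1 + \frac{y}{x}, \qquad \varphi(u) = 1 + u,\\
\frac{du}{\varphi(u) - u} &= du = \frac{dx}{x}
  \;\Longrightarrow\; u = \ln|x| + C,\\
y &= ux = x\,(\ln|x| + C), \qquad
y' = \ln|x| + C + 1 = 1 + \frac{y}{x}.
\end{aligned}
```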

Let us take a more general equation than (18), namely,

$$\frac{dy}{dx} = f\left(\frac{ax + by + c}{mx + ny + p}\right)$$

Suppose that the binomials $ax + by$ and $mx + ny$ are not
proportional to each other. Then we can make a substitution of the form
$x = x_1 + \alpha$ and $y = y_1 + \beta$ where the parameters $\alpha$ and
$\beta$ should be chosen so that there are no absolute terms in the
numerator and the denominator of the resulting fraction. After that we
substitute $u$ for the ratio $\frac{y_1}{x_1}$ and thus obtain an equation
with variables separable.

3. Linear equations. An equation of form (12) is called linear if
its left-hand side is a linear function in the unknown function and
its derivative, i.e. an equation of the form

$$a(x)\,y' + b(x)\,y + c(x) = 0$$

Dividing the equation by $a(x)$ we obtain

$$y' + p(x)\,y = f(x) \tag{19}$$

where $p = \frac{b}{a}$ and $f = -\frac{c}{a}$.

Equation (19) is called a homogeneous linear equation if $f(x)$
is identically equal to zero. Otherwise the equation is called
non-homogeneous. To solve the general non-homogeneous equation of
form (19) we first investigate the auxiliary homogeneous equation

$$z' + p(x)\,z = 0 \tag{20}$$

which corresponds to (19) and is obtained from (19) by dropping
the non-homogeneity term $f(x)$. The variables in equation (20)
are easily separated:

$$\frac{dz}{dx} = -p(x)\,z,\quad \frac{dz}{z} = -p(x)\,dx,\quad \ln|z| = -\int p(x)\,dx + \ln C,$$

$$z = C \exp\left[-\int p(x)\,dx\right] = C z_1 \tag{21}$$

where $z_1$ is obtained from $z$ if we substitute $C = 1$. Thus, $z_1$ is
a particular solution of equation (20).

After $z$ has been found we seek a solution of equation (19) in
the form

$$y = \varphi(x)\,z_1 \tag{22}$$

where $z_1$ is the same as in formula (21) and $\varphi(x)$ is a function
yet unknown. Such a replacement of the former arbitrary constant
entering into formula (21) by a function entering into formula (22)
is an application of a general method called the method of variation
of arbitrary constants (parameters) which was introduced by Lagrange.

Substituting (22) into equation (19) we obtain

$$\varphi' z_1 + \varphi z_1' + p\varphi z_1 = f,\quad \varphi' z_1 + \varphi\,(z_1' + p z_1) = f$$

Since the function $z_1$ satisfies equation (20), the expression inside
the brackets equals zero. Therefore,

$$\varphi' = \frac{f(x)}{z_1(x)},\quad \varphi = \int \frac{f(x)}{z_1(x)}\,dx + C,$$

$$y = z_1(x) \int \frac{f(x)}{z_1(x)}\,dx + C z_1(x)$$

The last expression is the general solution of equation (19). The
first term of the expression can be obtained by substituting $C = 0$.
Hence, the first term is a particular solution of the equation.

Thus, the general solution of non-homogeneous linear equation (19)
is equal to the sum of a particular solution of the non-homogeneous
equation and the general solution of the corresponding homogeneous
equation.

The equation

$$y' + p(x)\,y = f(x)\,y^{n}$$

is called Bernoulli's equation. It can be reduced to a linear
equation by dividing both sides by $y^{n}$ and introducing the
change of variables $y^{1-n} = u$ (check it!).
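The formula obtained by variation of the constant for the linear equation (19) can be tried out numerically. The sketch below rests on several assumptions made only for illustration: the indefinite integrals are replaced by definite integrals from $x_0$ (so that $z_1(x_0) = 1$ and the constant $C$ equals $y_0$), the quadratures are computed by Simpson's rule, and the test equation $y' + y = x$ with $y(0) = 2$ is a hypothetical example whose closed-form solution is $y = x - 1 + 3e^{-x}$.

```python
import math


def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g over [a, b] (n even)."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def solve_linear(p, f, x0, y0, x):
    """Value at x of the solution of y' + p(x) y = f(x) with y(x0) = y0."""

    def z1(s):
        # Solution of the homogeneous equation (20) normalized by z1(x0) = 1.
        return math.exp(-simpson(p, x0, s))

    # phi(x) = integral of f/z1 plus the constant, which equals y0 here.
    phi = y0 + simpson(lambda s: f(s) / z1(s), x0, x)
    return phi * z1(x)


y_num = solve_linear(lambda x: 1.0, lambda x: x, 0.0, 2.0, 1.5)
y_exact = 1.5 - 1.0 + 3.0 * math.exp(-1.5)
print(y_num, y_exact)
```

The nested quadrature is wasteful but keeps the code close to the formula: `z1` plays the role of $z_1$ and `phi` the role of $\varphi$ in formula (22).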

There are some other types of equation integrable by quadratures
(see [24]) but most differential equations are not integrable
by quadratures. For example, in the general case we cannot integrate
by quadratures the equation $y' = y^2 + f(x)$ which is called
Riccati's equation (named after J. Riccati, 1676-1754, an Italian
mathematician). Riccati's equations are applied to some problems.
There are many other simple differential equations that are not
integrable by quadratures.

5. Equation for Exponential Function. Let us consider the equation

$$\frac{dy}{dx} = ky \quad (k = \text{const}) \tag{23}$$

which indicates that the rate of change of the quantity $y$ with respect
to the quantity $x$ is proportional to the current value of $y$. Hence,
if $y(x) > 0$ then $y$ increases for $k > 0$ and decreases for $k < 0$.
Such a simple relationship between a quantity and its rate of change
is often used as a first approximation in investigating various
processes.

The variables in equation (23) can be separated, which yields

$$\frac{dy}{y} = k\,dx,\quad \ln|y| = kx + \ln C,\quad y = Ce^{kx}$$

In addition, if there is an initial condition $y(x_0) = y_0$ we obtain

$$y_0 = Ce^{kx_0},\quad C = y_0 e^{-kx_0},\quad\text{i.e.}\quad y = y_0 e^{k(x - x_0)} \tag{24}$$

Thus, the general solution of equation (23) is an exponential
function (see Sec. I.27). It is characteristic of this solution that if
we make $x$ assume the values forming an arithmetic progression
with common difference $\Delta x$ then the corresponding values of $y$
form a geometric progression with common ratio $e^{k\Delta x}$. We can
easily find the value of $\Delta x$ for which $y$ doubles or halves with
every step $\Delta x$. Indeed, this being so, we must have

$$|k\,\Delta x| = \ln 2,\quad\text{i.e.}\quad \Delta x = \frac{\ln 2}{|k|} \tag{25}$$

If $k > 0$ and $y_0 > 0$ formula (24) describes the so-called law of
exponential growth of the quantity $y$ which is characteristic of
various chain reactions. As an example, let us consider the process
of reproduction of bacteria in a culture medium, when their number
is not too large. We suppose that all the bacteria reproduce more
or less independently. Then we see that the rate of growth of the
number $u$ of the bacteria measured in certain units is proportional
to the number, i.e.

$$\frac{du}{dt} = ku,\quad u = u_0 e^{k(t - t_0)}$$

There are many problems of this kind that can be investigated
in a similar way, the problem of calculating the growth of a capital
deposited in a bank being one of them.

If $k < 0$ then formula (24) expresses an exponential decrease
of a quantity $y$. For example, this is the case when we investigate
the process of radioactive decay. In fact, let us denote the mass
of the remaining (not disintegrated) radioactive substance by $m$.
If we suppose that different parts of the mass decay independently
then we conclude that the rate of decay of the mass is proportional
to the current value of the mass, that is

$$\frac{dm}{dt} = -pm,\quad m = m_0 e^{-p(t - t_0)}$$

In particular, note that after the elapse of time $\Delta t = \frac{\ln 2}{p}$
the value of $m$ becomes half as large, this being suggested by formula
(25). The time interval $\Delta t$ is known as the half-life period (or
simply half-life) of the radioactive substance. For instance, $\Delta t$ is
approximately equal to $1.8 \times 10^3$ years for radium. This means that
if an initial mass of radium is not replenished then in $1.8 \times 10^3$
years we shall have half of the initial quantity and after another
$1.8 \times 10^3$ year period passes we shall have a quarter of the
initial mass etc.

There are many phenomena, such as the decrease of the atmospheric
pressure with the growth of the altitude or the discharge of a
capacitor through a resistance, which are investigated in a similar
way.
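Formulas (24) and (25) lend themselves to a short numerical illustration. In this sketch the decay constant `p` and the initial mass are hypothetical values chosen only so that the output is easy to read; the function names are likewise illustrative.

```python
import math


def halving_step(k):
    """Step of formula (25): ln 2 / |k|."""
    return math.log(2) / abs(k)


def exponential(y0, k, x, x0=0.0):
    """Formula (24): y = y0 * exp(k * (x - x0))."""
    return y0 * math.exp(k * (x - x0))


p = 0.25                      # hypothetical decay constant
dt = halving_step(-p)         # half-life corresponding to k = -p
masses = [exponential(8.0, -p, n * dt) for n in range(4)]
print(dt, masses)
```

The values of the time form an arithmetic progression with difference `dt`, and the computed masses form a geometric progression with ratio one half, as described for the solution (24).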

The equation of a problem can sometimes be transformed to form
(23) by means of some simple techniques. For example, as was
shown in Sec. VIII.7, the electric current $i$ in a circuit
consisting of a resistance $R$ and an inductance $L$ satisfies the equation

$$L\frac{di}{dt} + Ri = u \tag{26}$$

when a constant voltage $u$ is applied to the terminals of the circuit.

Equation (26) is a non-homogeneous linear equation which can
be integrated (solved) by means of the method described in Sec. 4.
But it is easier to transform the equation in the following way:

$$\frac{d}{dt}\left(i - \frac{u}{R}\right) = -\frac{R}{L}\left(i - \frac{u}{R}\right),\quad i - \frac{u}{R} = \left(i_0 - \frac{u}{R}\right) e^{-\frac{R}{L}(t - t_0)}$$



We obtain a simpler case when there is no initial current in the
circuit. Indeed, let $t_0 = 0$ be the initial moment. Then
$i(t_0) = i(0) = i_0 = 0$ and we obtain the formula

$$i = \frac{u}{R}\left(1 - e^{-\frac{R}{L}t}\right) \tag{27}$$

The graph of relationship (27) is shown in Fig. 286. We see that
the current increases and exponentially tends to the limiting
steady-state value as $t \to \infty$. This value can also be easily found
from equation (26) if we take into account that $\frac{di}{dt} \to 0$ as
$t \to \infty$ in the process of current rise. Therefore, in the limit we
have $Ri = u$, i.e. $i = \frac{u}{R}$. Thus, when the current becomes
practically steady-state the whole voltage drop is on the resistance.
During the time period

$$T = \frac{\ln 2}{R/L} = \frac{L}{R}\ln 2$$

the deviation of the current from its limit value is halved.

The fact that it is the constant $e$ that appears as the base of the
exponential function in formula (24) is one of the main reasons
which account for the important role of the constant in mathematics
and its applications.
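Returning to the circuit example, formula (27) and the halving time $T = \frac{L}{R}\ln 2$ can be illustrated numerically. The component values below are hypothetical and serve only as an example; the function name `current` is also an assumption of this sketch.

```python
import math


def current(t, u, R, L):
    """Formula (27): current in the circuit for i(0) = 0."""
    return u / R * (1.0 - math.exp(-R * t / L))


u, R, L = 12.0, 4.0, 0.5      # hypothetical values: volts, ohms, henries
i_limit = u / R               # steady-state value obtained from R i = u
T = L / R * math.log(2)       # time in which the deviation is halved

deviations = [i_limit - current(n * T, u, R, L) for n in range(4)]
print(i_limit, T, deviations)
```

Each further interval of length $T$ halves the remaining deviation of the current from its limiting value $\frac{u}{R}$.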

6. Integrating Exact Differential Equations. A differential equation
of the first order is often written in the symmetric form

$$P(x, y)\,dx + Q(x, y)\,dy = 0 \tag{28}$$

in place of form (13). Here $P(x, y)$ and $Q(x, y)$ are given functions,
and it is the functional relationship between $x$ and $y$ that is
considered to be unknown. We can easily pass from one form to another.
For instance, to transform equation (28) to form (13) we must divide
both sides of (28) by $Q\,dx$ and then transpose $\frac{P}{Q}$ to the
right-hand side. Form (28) is preferable in those cases when the
variables $x$ and $y$ are regarded as being equivalent, that is when we
do not set beforehand which of the variables $x$ and $y$ is an argument
and which is a function.

There is a special case here when the left-hand side of equation 
(28) is the total (exact) differential of a function, that is 

P dx + Q dy = du (x, y) (29) 

Then we can easily integrate the equation. Actually, in this case 
the equation can be rewritten as du = 0 and hence, integrating 
we get the general solution 


u (x, y) = C (30) 

where C is an arbitrary constant, as usual. 

At the end of Sec. XIV.24 we obtained a condition guaranteeing
the existence of such a function $u(x, y)$, namely, the condition

$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \tag{31}$$

This condition is necessary and sufficient for the left-hand side
of equation (28) to be an exact differential. The function $u(x, y)$
is found according to formula (XIV.97) in which, of course, we must
drop the last summand under the integral sign in our case. As was
pointed out in Sec. XIV.24, generally we can arrive at a
multiple-valued function $u$ if the domain under consideration is
multiply-connected. But even in this case formula (30) expresses the
general solution of equation (28).

As an example, let us take the equation

$$(x^2 + 2xy)\,dx + (x^2 - y^3)\,dy = 0 \tag{32}$$

Here we have

$$\frac{\partial P}{\partial y} = \frac{\partial (x^2 + 2xy)}{\partial y} = 2x \quad\text{and}\quad \frac{\partial Q}{\partial x} = \frac{\partial (x^2 - y^3)}{\partial x} = 2x$$

Thus, condition (31) is fulfilled. In order to apply formula (XIV.97)
to constructing the function $u$ we choose the point $M_0$ at the origin
of coordinates, for definiteness. We also choose the path connecting
$M_0$ with the variable point $M(x, y)$ in the way shown in Fig. 287.
Consequently, we obtain

$$u(x, y) = \int_{M_0 M} [(x^2 + 2xy)\,dx + (x^2 - y^3)\,dy] = \int_{M_0 M'} [(x^2 + 2xy)\,dx + (x^2 - y^3)\,dy] + \int_{M' M} [(x^2 + 2xy)\,dx + (x^2 - y^3)\,dy] \tag{33}$$

We must put $y = 0$ and $dy = 0$ in the first integral and regard $x$
as constant, so that $dx = 0$, in the second integral (why?).
From this we obtain

$$u(x, y) = \int_0^x x^2\,dx + \int_0^y (x^2 - y^3)\,dy = \frac{x^3}{3} + x^2 y - \frac{y^4}{4} \tag{34}$$

Hence, the general solution of equation (32) has the form

$$\frac{x^3}{3} + x^2 y - \frac{y^4}{4} = C$$
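For equation (32), whose left-hand side is the differential of $u = \frac{x^3}{3} + x^2 y - \frac{y^4}{4}$, both condition (31) and relation (29) can be checked numerically. This is a sketch under stated assumptions: partial derivatives are approximated by central differences, and the sample points and the step `h` are arbitrary choices made for illustration.

```python
def P(x, y):
    return x * x + 2 * x * y


def Q(x, y):
    return x * x - y ** 3


def u(x, y):
    """Function (34): u = x^3/3 + x^2 y - y^4/4."""
    return x ** 3 / 3 + x * x * y - y ** 4 / 4


def d_dx(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative in x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, x, y, h=1e-5):
    """Central-difference estimate of the partial derivative in y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


points = [(0.3, -1.2), (1.0, 2.0), (-2.5, 0.4)]

# Condition (31): dP/dy and dQ/dx must coincide.
exactness_gap = max(abs(d_dy(P, x, y) - d_dx(Q, x, y)) for x, y in points)

# Relation (29): du/dx = P and du/dy = Q.
gradient_gap = max(
    max(abs(d_dx(u, x, y) - P(x, y)), abs(d_dy(u, x, y) - Q(x, y)))
    for x, y in points
)
print(exactness_gap, gradient_gap)
```

Both gaps stay at the level of the finite-difference error, which is what relations (29) and (31) require.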



Let us similarly treat the equation

$$\frac{-y\,dx + x\,dy}{x^2 + y^2} = 0 \tag{35}$$

By the way, the equation can be easily integrated directly
because the variables in the equation are separable. The equation
makes sense for all values of $x$ and $y$ except $x = 0$, $y = 0$ since
both coefficients $P$ and $Q$ are discontinuous at the point $(0, 0)$.
We must therefore take the x, y-plane with the origin of coordinates
punctured. A plane with one point punctured is not simply-connected.
As was shown in Sec. XIV.24, it is doubly-connected.

Condition (31) is also fulfilled for equation (35) (check it!).
In constructing the function $u$ in accordance with formula (XIV.97)
we can choose the point $M_0$ anywhere except, of course, the origin
of coordinates. For instance, let us take the point $M_0(1, 0)$. We
first consider the case $x > 0$. Performing calculations analogous to
(33) and (34) we find $u = \arctan\frac{y}{x}$ (check the result!). The
same function $u = \arctan\frac{y}{x}$ satisfies relation (29) for $x < 0$
too. But if we consider the function $u = \arctan\frac{y}{x}$ to be
defined over the whole x, y-plane (with the point $(0, 0)$ removed) it
will be discontinuous on the straight line $x = 0$. To get rid of the
discontinuity we can put

$$u = \operatorname{Arctan}\frac{y}{x} = \varphi$$

where $\varphi$ is the polar angle of the point $(x, y)$. This function is
not single-valued. Even if we take a certain point $M_0$ and choose
a certain value of $\varphi$ for the point, the argument $\varphi$ will gain
the increment $2\pi$ after $M$ traces a closed path round the origin. But
nevertheless the general solution of equation (35) is of the form

$$\operatorname{Arctan}\frac{y}{x} = C,\quad\text{i.e.}\quad \frac{y}{x} = \tan C = C_1 \quad\text{and}\quad y = C_1 x$$

where $C_1$ is an arbitrary constant. From the geometric point of view
we have obtained the totality of all the possible straight lines passing
through the origin of coordinates.
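The two properties of equation (35) discussed above, namely that condition (31) holds away from the origin and that the polar angle gains $2\pi$ on a circuit round the origin, can be illustrated numerically. The sketch assumes the coefficients $P = \frac{-y}{x^2 + y^2}$ and $Q = \frac{x}{x^2 + y^2}$; the unit circle as the closed path, the midpoint rule and the sample points are illustrative choices.

```python
import math


def P(x, y):
    return -y / (x * x + y * y)


def Q(x, y):
    return x / (x * x + y * y)


def exactness_gap(x, y, h=1e-5):
    """|dP/dy - dQ/dx| at (x, y), estimated with central differences."""
    dP_dy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    dQ_dx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    return abs(dP_dy - dQ_dx)


def loop_integral(n=2000):
    """Integral of P dx + Q dy once round the unit circle (midpoint rule)."""
    total = 0.0
    for i in range(n):
        t = 2 * math.pi * (i + 0.5) / n
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)   # derivatives of x and y in t
        total += (P(x, y) * dx + Q(x, y) * dy) * (2 * math.pi / n)
    return total


gap = max(exactness_gap(x, y) for x, y in [(0.7, 1.1), (-1.3, 0.4), (2.0, -0.5)])
increment = loop_integral()
print(gap, increment)
```

The integral round the origin does not vanish although condition (31) holds at every point of the path; this is the increment of the multiple-valued function $u$ on the doubly-connected domain.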

It may happen that condition (31) does not hold for equation (28).
Then the left-hand side of such an equation is not a total
differential. It can be shown that in this case there always exists a
factor such that the equation becomes exact after being multiplied by
the factor. Then the equation can be integrated provided the factor
is known. For example, the left-hand side of the equation
$-y\,dx + x\,dy = 0$ does not satisfy condition (31). But if we multiply
the equation by the factor $\frac{1}{x^2 + y^2}$ the equation is
transformed to form (35), which satisfies condition (31). Generally,
such a factor is called the integrating factor of equation (28). There
are no general techniques for finding an integrating factor but it can
be found for certain specific classes of differential equations.
Besides, the concept of an integrating factor is used in some
theoretical investigations.

7. Singular Points and Singular Solutions. There are some cases
when a first-order equation written in the form

$$y' = f(x, y) \tag{36}$$

[see equation (13)] or in the form

$$P(x, y)\,dx + Q(x, y)\,dy = 0 \tag{37}$$

[see equation (28)] possesses more than one integral curve passing
through some point in the x, y-plane or has no such curves. The
points of this kind are called singular points of the equation in
question. They can either be isolated or form entire singular curves.

We begin our investigation of equation (36) with a simple special
example, namely, with the equation

$$y' = y^{a} \quad (a > 0) \tag{38}$$

Let us regard $y(x)$ as being non-negative ($y \ge 0$). Equation (38)
can be easily integrated:

$$\frac{dy}{y^{a}} = dx,\quad -\frac{1}{(a-1)\,y^{a-1}} = x - C \ \text{for}\ a \ne 1 \quad\text{and}\quad \ln y = x - C \ \text{for}\ a = 1 \tag{39}$$

Here we have written $-C$ instead of $C$ for the convenience of our
further considerations. The sign in front of $C$ does not matter because
$C$ itself can have any sign. Let us distinguish between the following
two cases.

1. $a > 1$. Then solution (39) can be put down as

$$y = \frac{1}{[(a-1)(C-x)]^{\frac{1}{a-1}}} = \frac{\text{const}}{(C-x)^{\frac{1}{a-1}}}$$

which implies that if $x$ varies from $-\infty$ to $C$ then $y$ increases
from zero to infinity. The graph is shifted along the x-axis as the
constant $C$ is changed. The family of integral curves thus obtained is
depicted in Fig. 288. The x-axis itself is an integral curve. It can be
obtained as the limit when $C \to \infty$. We see that in this case, for
each point of the upper half-plane, there is one and only one integral
curve passing through the point. Our example also shows that the
solution $y(x)$ of an equation may not be defined over the whole x-axis
and may exist only for a certain part of the axis. Indeed, in this
example $y(x)$ exists only on the interval $-\infty < x < C$.

In the case $a = 1$ we also get a similar uniqueness (check this
assertion!).

2. $0 < a < 1$. Then we can write solution (39) in the form

$$y = (1-a)^{\frac{1}{1-a}} (x - C)^{\frac{1}{1-a}} = \text{const}\,(x - C)^{\frac{1}{1-a}} \tag{40}$$

from which it follows that if $x$ varies from $C$ to $\infty$ the variable
$y$ increases from zero to infinity. The corresponding family of integral
curves is shown in Fig. 289. The x-axis is an integral curve again,
which is evident from equation (38), but now it cannot be obtained
from formula (40) for any value of $C$. In this case, for each point
of the x-axis, there are two distinct integral curves passing through
the point, namely, the x-axis itself and the corresponding curve
defined by formula (40). Recall that the problem of constructing
an integral curve passing through a given point of the plane is
called the Cauchy problem. Thus, in our case the uniqueness of
solution of the Cauchy problem is violated.
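The loss of uniqueness can be observed numerically. The sketch below takes the particular exponent $a = \frac{1}{2}$ and the constant $C = 1$ (both are assumptions made for illustration) and checks that the curve (40) and the x-axis both pass through the point $(C, 0)$ and both satisfy equation (38) for $x > C$; the derivative is approximated by a central difference.

```python
def family(x, C, a):
    """Formula (40) for 0 < a < 1; meaningful for x >= C only."""
    return ((1.0 - a) * (x - C)) ** (1.0 / (1.0 - a))


def residual(sol, x, a, h=1e-6):
    """Value of y' - y**a for equation (38), y' from a central difference."""
    dydx = (sol(x + h) - sol(x - h)) / (2 * h)
    return dydx - sol(x) ** a


a, C = 0.5, 1.0


def curve(x):
    """Integral curve (40) through the point (C, 0)."""
    return family(x, C, a)


def axis(x):
    """The x-axis, which also passes through the point (C, 0)."""
    return 0.0


xs = (1.5, 2.0, 3.0)
res_curve = max(abs(residual(curve, x, a)) for x in xs)
res_axis = max(abs(residual(axis, x, a)) for x in xs)
print(curve(C), axis(C), res_curve, res_axis)
```

For $a = \frac{1}{2}$ formula (40) reduces to $y = \frac{(x - C)^2}{4}$, so two different solutions start from the same point of the x-axis.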

We see that in the second case the points belonging to the x-axis
become singular points. To find out the cause of this phenomenon
we compute the derivative of the right-hand side of equation
(38) at these points, that is for the value $y = 0$. Calculating we
obtain

$$\frac{\partial (y^{a})}{\partial y} = a y^{a-1} \quad\text{and}\quad \left(\frac{\partial (y^{a})}{\partial y}\right)_{y=0} = \begin{cases} 0 & \text{for } a > 1, \\ 1 & \text{for } a = 1, \\ \infty & \text{for } a < 1 \end{cases}$$

Therefore, as a point $(x, y)$ approaches the x-axis in the case
$0 < a < 1$, the direction of the field changes so fast that every
integral curve approaches the axis at a finite point but not at infinity
as we had in Fig. 288.

We see that in the case under consideration the conditions of
Cauchy's theorem on existence and uniqueness of the solution are
not fulfilled because they include the requirement that the derivative
$\frac{\partial f}{\partial y}$ should be finite. In other cases we can also
obtain more than one integral curve passing through a point $M_0$ if the
value of the derivative $\frac{\partial f}{\partial y}$ is infinite at the
point $M_0$ although this is not a necessary consequence of the fact.

In particular, if $\frac{\partial f}{\partial y}$ approaches infinity on
a curve $(L)$ and if the curve itself is an integral curve then, as a
rule, besides $(L)$, there is at least one more integral curve passing
through each point of $(L)$. In this case we say that $(L)$ is a singular
integral curve which means that $(L)$ is an integral curve all of whose
points are singular points. The corresponding solution for which a
singular integral curve serves as its graph is called a singular
solution. A singular solution does not usually enter into the general
solution, that is, as a rule, it cannot be obtained from the general
solution for any value of the arbitrary constant. For instance, the
x-axis is a singular integral curve and the function $y \equiv 0$ is the
corresponding singular solution of equation (38) in the case
$0 < a < 1$ (why?).

There is another approach to the notion of a singular solution.
Fig. 289 shows that the x-axis is the envelope of the family of
integral curves (see Sec. XII.5) in the case when $0 < a < 1$ for
equation (38).

In the general case the envelope of a family of integral curves
is also an integral curve because its tangent coincides with the
direction of the field at every point provided such an envelope exists.
At the same time such an envelope is a singular integral curve since
there are other integral curves passing through the points of the
envelope. This leads to a method of finding singular solutions based
on the consideration given in Sec. XII.5. Suppose we have managed
to obtain the general solution in the form $\Phi(x, y, C) = 0$. Then,
by Sec. XII.5, we can find a singular solution if we eliminate $C$
from the equations

$$\Phi(x, y, C) = 0 \quad\text{and}\quad \Phi'_C(x, y, C) = 0 \tag{41}$$

[let the reader perform the calculations for solution (40)].
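One possible way of carrying out this calculation is sketched below. The shorthand symbols $k$ and $m$ are introduced here only for brevity and are not notation used in the text.

```latex
\begin{aligned}
\Phi(x, y, C) &= y - k\,(x - C)^{m} = 0, \qquad
  k = (1-a)^{\frac{1}{1-a}},\quad m = \frac{1}{1-a} > 1,\\
\Phi'_C(x, y, C) &= k\,m\,(x - C)^{m-1} = 0
  \;\Longrightarrow\; C = x,\\
y &= k\,(x - x)^{m} = 0 .
\end{aligned}
```

Eliminating $C$ thus gives $y = 0$, that is the x-axis, in agreement with Fig. 289.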

We now proceed to investigate equation (37). For the sake of
simplicity, let us suppose that the functions $P$ and $Q$ are continuous
and that their derivatives of the first order are finite. Equation (37)
can be rewritten in the form

$$\frac{dy}{dx} = -\frac{P(x, y)}{Q(x, y)},\quad \frac{dx}{dy} = -\frac{Q(x, y)}{P(x, y)} \tag{42}$$

We can therefore apply the above-mentioned Cauchy's theorem
concerning equation (36). Thus, if $Q(x_0, y_0) \ne 0$ or
$P(x_0, y_0) \ne 0$ at a point $M_0(x_0, y_0)$ then there exists a unique
integral curve passing through the point $M_0(x_0, y_0)$. [To apply
Cauchy's theorem it is sufficient to denote one of the right-hand sides
of (42) which has a non-zero denominator as $f(x, y)$.] But if

$$P(x_0, y_0) = 0,\quad Q(x_0, y_0) = 0 \tag{43}$$

then equation (37) no longer defines a certain relationship between
$dx$ and $dy$ at the point $M_0(x_0, y_0)$ and therefore the direction
field turns out to be undetermined at the point. Hence, the singular
points of equation (37) are defined by relationship (43).



Fig. 290. Singular points of differential equations

(a) Nodal point: $y\,dx - x\,dy = 0$, $y = Cx$
(b) Nodal point: $2y\,dx - x\,dy = 0$, $y = Cx^2$
(c) Saddle point: $y\,dx + x\,dy = 0$, $xy = C$
(d) Centre: $x\,dx + y\,dy = 0$, $x^2 + y^2 = C$
(e) Focal point: $(x + y)\,dx - (x - y)\,dy = 0$, $\rho = Ce^{\varphi}$
(in polar coordinates)

In Fig. 290 we give some examples of the most widely encountered
types of singular point together with their names. (Let the reader
verify the solutions written in Fig. 290 and the forms of their graphs
which are also shown there.) The origin of coordinates is the only
singular point in all these examples. In examples (a), (b) and (e)
there are infinitely many integral curves passing through the singular
point; there are two such curves in example (c) whereas there are
no integral curves passing through the point in example (d). It
should be noted that the coordinate axes themselves are integral
curves in examples (a), (b) and (c). To integrate equation (e)
in Fig. 290 it is convenient to transform the equation to polar
coordinates.
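The solutions listed for the five examples of Fig. 290 can be verified numerically along one representative curve of each family. This is a sketch under stated assumptions: the particular constants (for instance $C = 2$ in example (a) and $C = 1$ in example (e)), the parametrizations and the central-difference tangent are illustrative choices.

```python
import math

# For each example: coefficient P, coefficient Q, and one solution curve
# t -> (x(t), y(t)) taken from the corresponding family.
examples = {
    "a": (lambda x, y: y, lambda x, y: -x, lambda t: (t, 2 * t)),
    "b": (lambda x, y: 2 * y, lambda x, y: -x, lambda t: (t, 3 * t * t)),
    "c": (lambda x, y: y, lambda x, y: x, lambda t: (t, 2 / t)),
    "d": (lambda x, y: x, lambda x, y: y,
          lambda t: (2 * math.cos(t), 2 * math.sin(t))),
    "e": (lambda x, y: x + y, lambda x, y: -(x - y),
          lambda t: (math.exp(t) * math.cos(t), math.exp(t) * math.sin(t))),
}


def residual(P, Q, curve, t, h=1e-6):
    """Value of P dx/dt + Q dy/dt along the curve (central differences)."""
    x, y = curve(t)
    x_plus, y_plus = curve(t + h)
    x_minus, y_minus = curve(t - h)
    dx = (x_plus - x_minus) / (2 * h)
    dy = (y_plus - y_minus) / (2 * h)
    return P(x, y) * dx + Q(x, y) * dy


worst = {
    name: max(abs(residual(P, Q, curve, t)) for t in (0.5, 1.0, 2.0))
    for name, (P, Q, curve) in examples.items()
}
print(worst)
```

In example (e) the parameter $t$ coincides with the polar angle $\varphi$, so the curve used is the spiral $\rho = e^{\varphi}$.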

8. Equations Not Solved for the Derivative. The equation

$$F(x, y, y') = 0 \tag{44}$$

differs from equation (13) investigated in Sec. 3 because in this
case $y'$ is an implicit function of $x$ and $y$. A characteristic feature
of implicit functions is that generally they are many-valued
(see Sec. I.20). Therefore, if we solve equation (44) for $y'$ (which
is theoretically possible but may be difficult to realize practically)
we shall get several solutions in the general case:

$$y' = f_1(x, y),\quad y' = f_2(x, y),\quad \ldots,\quad y' = f_k(x, y) \tag{45}$$

Each of the solutions satisfies equation (44).

Each of equations (45) defines its own direction field in the plane
and generates a family of integral curves which cover the plane
(see Sec. 3). Therefore if in a certain part of the x, y-plane equation
(44) possesses $k$ solutions with respect to $y'$ we have the
superposition of these $k$ direction fields. Hence, there are $k$ integral
curves passing through each point of the part of the plane, that is an
initial condition $y(x_0) = y_0$ defines $k$ solutions (see Fig. 291 where
we have taken $k = 3$).

Fig. 291

In Sec. 7 we considered singular points, singular curves and singular
integral curves of equation (36), and equation (44) may likewise
have them. Equation (44) suggests that

$$\frac{\partial y'}{\partial y} = -\frac{F'_y(x, y, y')}{F'_{y'}(x, y, y')}$$

(see Sec. IX.13) and therefore, by Cauchy’s theorem (see Sec. 3),
such points and curves may appear only if equation (44) and the
equation

$$F'_{y'}(x, y, y') = 0 \tag{46}$$

hold simultaneously (of course, if $F'_y \neq \infty$).

In particular, a singular solution (provided it exists) whose graph
is the envelope of a family of integral curves can be obtained by
eliminating y' from (44) and (46). There is another method of
constructing a singular solution based on formulas (41).
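A small worked example may make the elimination concrete. The equation $y'^2 = 4y$ is chosen here purely for illustration and is not taken from the text.

```latex
\[
F(x, y, y') = y'^{2} - 4y = 0, \qquad F'_{y'}(x, y, y') = 2y' = 0 .
\]
\[
\text{Eliminating } y' \text{ from the two relations gives } y = 0 .
\]
\[
\text{General solution: } y = (x - C)^{2},
\quad\text{since } y' = 2(x - C) \text{ and } y'^{2} = 4(x - C)^{2} = 4y .
\]
```

The line $y = 0$ satisfies the equation and touches each parabola $y = (x - C)^2$ at the point $x = C$, so it is the envelope of the family and hence a singular solution. In general the curve produced by the elimination should still be substituted into (44), because it need not be a solution.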

9. Method of Integration by Means of Differentiation. There are
certain cases when equation (44) can be integrated if it is
differentiated beforehand. For instance, let us take the equation

$$x = f(y') \tag{47}$$

Such equations are usually written in the form

$$x = f(p) \tag{48}$$

where $p = y'$.

Differentiating both sides we derive

$$dx = f'(p)\,dp$$

By means of the last equality and formula $\frac{dy}{dx} = p$ we find the
following expression of dy:

$$dy = p\,dx = p f'(p)\,dp$$

This implies

$$y = \int p f'(p)\,dp + C \tag{49}$$

Equalities (48) and (49) simultaneously define a functional relation
between x and y which is expressed parametrically (see
Sec. II.6), the variable p being the parameter. Thus we have obtained
the general solution of equation (47) in a parametric form. An
equation of the form $y = f(y')$ can be solved in a similar way
(verify it!).
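The parametric recipe (48), (49) can be tried numerically. The following is a minimal sketch, assuming the sample function $f(p) = p^3 + p$ (not from the text); the helper names are hypothetical. It checks that the slope $dy/dx$ along the parametric curve reproduces the parameter p.

```python
# Parametric solution of x = f(y') following (48)-(49), with the
# illustrative (assumed) choice f(p) = p**3 + p.

def f(p):
    return p**3 + p

def x_of_p(p):
    # relation (48): x = f(p)
    return f(p)

def y_of_p(p, C=0.0):
    # relation (49): y = integral of p*f'(p) dp + C;
    # here p*f'(p) = 3p^3 + p, so the integral is 3p^4/4 + p^2/2
    return 0.75 * p**4 + 0.5 * p**2 + C

def slope(p, h=1e-6):
    # dy/dx along the curve, computed as (dy/dp)/(dx/dp) by central differences
    dy = y_of_p(p + h) - y_of_p(p - h)
    dx = x_of_p(p + h) - x_of_p(p - h)
    return dy / dx

for p in (-1.5, -0.3, 0.4, 2.0):
    # the second number should agree with the first up to rounding
    print(p, slope(p))
```

The constant C drops out of the slope, which is why a whole one-parameter family of curves is obtained.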

The so-called Lagrange’s equation of the form

$$y = f(y')\,x + g(y'), \quad\text{i.e.}\quad y = f(p)\,x + g(p) \quad (p = y') \tag{50}$$

is a little more complicated than (47). The equation is linear in the
variables x and y but it is non-linear in the sense of the definition
given in Sec. 4. After the equation has been differentiated we get

$$dy = p\,dx = f'(p)\,dp\,x + f(p)\,dx + g'(p)\,dp$$

that is

$$[p - f(p)]\,\frac{dx}{dp} = f'(p)\,x + g'(p)$$

If $f(p) \not\equiv p$ we can divide both sides of the equation by $p - f(p)$
and thus obtain a linear equation [see equation (19)] in which x
is regarded as a function of p. After the last equation has been
integrated we obtain a relationship of the form $x = x(p, C)$. This
relation together with relation (50) defines the general solution
of the original equation in parametric form.

In a special case when $f(p) \equiv p$ equation (50) is called Clairaut’s
equation after the French mathematician A. Clairaut (1713-1765)
who was the first to investigate the equation in 1734. It has the form
$y = xy' + g(y')$, i.e.

$$y = xp + g(p) \quad (p = y') \tag{51}$$

The differentiation yields $p\,dx = p\,dx + x\,dp + g'(p)\,dp$, and
thus we have

$$dp\,[x + g'(p)] = 0 \tag{52}$$

Equating the first factor to zero we obtain, by (51),

$$p = C, \quad\text{i.e.}\quad y = Cx + g(C) \tag{53}$$

This is the general solution of equation (51).

Equating to zero the second factor entering into the left-hand
side of (52) we deduce

$$x = -g'(p), \qquad y = xp + g(p) = -p\,g'(p) + g(p) \tag{54}$$

Thus we have obtained one more solution, a singular solution, which
is not contained in the general solution and is represented
parametrically. Geometrically, formula (53) defines a family of
straight lines (why?). Formula (54) defines the envelope of the
family. [Verify the last assertion on the basis of equation (53).]
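The relation between (53) and (54) can be checked on a concrete case. The sketch below assumes the sample function $g(p) = p^3$ (not from the text) and uses exact rational arithmetic; all function names are hypothetical. For each C it checks that the point of curve (54) with parameter p = C lies on the straight line (53) with the same C, and that the curve has slope C there, which is what tangency to the envelope requires.

```python
from fractions import Fraction

def g(p):
    # sample function g (an assumption for illustration)
    return p ** 3

def dg(p):
    # its derivative g'(p)
    return 3 * p ** 2

def line(C, x):
    # general solution (53): y = C*x + g(C)
    return C * x + g(C)

def singular(p):
    # singular solution (54): x = -g'(p), y = -p*g'(p) + g(p)
    return -dg(p), -p * dg(p) + g(p)

def secant_slope(p, h=Fraction(1, 10**6)):
    # slope dy/dx of the parametric curve (54), estimated from two nearby points
    x1, y1 = singular(p - h)
    x2, y2 = singular(p + h)
    return (y2 - y1) / (x2 - x1)

for C in (Fraction(-2), Fraction(1, 3), Fraction(5, 2)):
    x, y = singular(C)
    # first value: the point lies on the line; second: slope mismatch (tiny)
    print(C, line(C, x) == y, float(secant_slope(C) - C))
```

The value C = 0 is avoided in the loop because for this particular g the parametric curve has a cusp at p = 0, where the secant estimate is not defined.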

For instance, the equation $y = xy' - y'^2$ has the general solution

$$y = Cx - C^2 \tag{55}$$

and the singular solution whose graph is the envelope of the
family of straight lines (55). To find the envelope we differentiate
both sides of (55) with respect to C which yields

$$0 = x - 2C$$

Eliminating C from the last two formulas we get $C = \frac{x}{2}$. This
results in

$$y = \frac{x^2}{4}$$

The corresponding integral curves are depicted in Fig. 292.

§ 3. Higher-Order Equations and Systems of Differential Equations

10. Higher-Order Differential Equations. Some general notions
related to such equations were given in Sec. 2 [namely, equation (5),
general solution (8) or (9) and initial conditions (11)]. By the way,
usually it is easier to investigate an nth-order equation when it
is given in the form solved for the highest derivative:

$$y^{(n)} = f(x, y, y', \ldots, y^{(n-1)})$$

In particular, Cauchy’s theorem is immediately extended to such
an equation (see Sec. 3): if the function f is continuous and has
finite partial derivatives of the first order with respect to
$y, y', \ldots, y^{(n-1)}$ at a given initial point defined by conditions (11) then
there exists a solution of the equation satisfying the initial
conditions and the solution is unique.

We now consider some particular types of higher-order equations 
that are integrable by quadratures. For higher-order equations such 
cases are still rarer than for the first-order equations. Here we shall 
consider non-linear equations (linear equations will be treated in
detail in § 4). The main method for formal integration of non-linear 
higher-order differential equations is the method of reducing the 
order. It enables us to pass to an equation of a lower order which is 
equivalent to the original equation. As a rule, the lower the order 
of an equation, the easier the integration. Besides, we can sometimes 
come to a first-order equation belonging to one of the integrable 
types (see Sec. 4) after the order of the original equation is reduced 
several times. In such a case we are able to complete the integration. 
Here we shall consider certain particular methods of reducing the 
order. Some other techniques can be found in [24]. 

1. For example, let us take the equation of the second order

$$y'^2 + yy'' = 0$$

To integrate it observe that its left-hand side can be rewritten
in the form

$$y'^2 + yy'' \equiv (yy')'$$

whence $(yy')' = 0$, $yy' = C_1$, $y\,dy = C_1\,dx$ and

$$\frac{y^2}{2} = C_1 x + C_2$$

(which is the general solution).

Further, an equation of the form

$$y'^2 - yy'' = 0$$

can be easily integrated in a similar way if we divide both sides
by $y^2$ beforehand. Indeed,

$$\frac{yy'' - y'^2}{y^2} = 0, \quad\text{i.e.}\quad \left(\frac{y'}{y}\right)' = 0, \quad \frac{y'}{y} = C_1 \quad\text{and}\quad \frac{dy}{y} = C_1\,dx$$

which results in the general solution

$$\ln|y| = C_1 x + \ln C_2, \qquad y = C_2 e^{C_1 x}$$

Here, after the division by $y^2$, we have obtained a so-called
integrable combination, that is an expression whose “exact derivative”
is equated to zero. Such a method is sometimes applicable to other
equations.
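Both general solutions obtained by integrable combinations can be checked numerically. This is a minimal sketch: the constants $C_1 = 0.7$, $C_2 = 1.3$ and the helper names are assumptions made for illustration, and the derivatives are approximated by central differences, so the residuals are only expected to be small, not exactly zero.

```python
import math

def residual(y, x, sign, h=1e-4):
    # value of y'^2 + sign * y * y'' at x, derivatives by central differences
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / (h * h)
    return d1 * d1 + sign * y(x) * d2

C1, C2 = 0.7, 1.3  # arbitrary sample constants (assumed values)

def y_plus(x):
    # positive branch of y^2 / 2 = C1*x + C2, a solution of y'^2 + y*y'' = 0
    return math.sqrt(2 * (C1 * x + C2))

def y_minus(x):
    # y = C2 * exp(C1 * x), a solution of y'^2 - y*y'' = 0
    return C2 * math.exp(C1 * x)

for x in (0.0, 0.5, 1.0):
    # both residuals should be close to zero
    print(x, residual(y_plus, x, +1), residual(y_minus, x, -1))
```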

For the sake of simplicity, we shall further consider the cases
when the order can be reduced for an equation of the second order
of the form

$$F(x, y, y', y'') = 0 \tag{56}$$

2. Let an equation of form (56) not contain y. Let it contain
only the derivatives of y and the argument. This is an equation
of the type

$$F(x, y', y'') = 0 \tag{57}$$

Introducing the notation $y' = p = p(x)$ we obtain the equation

$$F(x, p, p') = 0$$

which is implied by (57). Thus, we have got a first-order equation.
Suppose we have managed to integrate this equation and have
obtained its general solution $p = \varphi(x, C_1)$. Then we have
$y' = \varphi(x, C_1)$ and therefore the general solution of equation (57)
is thus obtained:

$$y = \int \varphi(x, C_1)\,dx + C_2$$

3. Let equation (56) not contain x, i.e. let it be of the form

$$F(y, y', y'') = 0 \tag{58}$$

Then we again put $y' = p$ but regard p as being a function of y.
It should be noted that here we cannot simply substitute the
expression $y'' = p'$ for the derivative $y''$ into (58) because $p'$ is the
derivative of p with respect to x but not to the new argument y.
Therefore we write

$$y'' = \frac{d(y')}{dx} = \frac{dp}{dx} = \frac{dp}{dy}\,\frac{dy}{dx} = p\,\frac{dp}{dy}$$

Now we deduce from equation (58) the equation

$$F\left(y, p, p\,\frac{dp}{dy}\right) = 0$$

which is a first-order equation. If we manage to integrate it and to
find its general solution $p = \varphi(y, C_1)$ then we have $\frac{dy}{dx} = \varphi(y, C_1)$
and therefore the general solution of equation (58) can be directly
written in the form

$$\int \frac{dy}{\varphi(y, C_1)} = x + C_2$$
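As an illustration of the substitution $y' = p(y)$, it can be applied to the equation $y'^2 + yy'' = 0$ considered at the beginning of this section, which does not contain x. Using this equation here is a choice made for the example.

```latex
\[
y' = p(y), \quad y'' = p\,\frac{dp}{dy}
\;\Longrightarrow\;
p^{2} + y\,p\,\frac{dp}{dy} = 0 .
\]
\[
\text{For } p \not\equiv 0:\quad
\frac{dp}{p} = -\frac{dy}{y}
\;\Longrightarrow\;
p = \frac{C_1}{y}
\;\Longrightarrow\;
y\,dy = C_1\,dx
\;\Longrightarrow\;
\frac{y^{2}}{2} = C_1 x + C_2 .
\]
```

The case $p \equiv 0$ gives $y = \text{const}$, which is contained in the last formula for $C_1 = 0$. The result agrees with the one found by means of the integrable combination $(yy')' = 0$.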


4. Let the left-hand side of equation (56) be a homogeneous function
in the variables y, y' and y'' (see Sec. IX.12):

$$F(x, ty, ty', ty'') = t^k F(x, y, y', y'') \tag{59}$$

In this case we can reduce the order by means of the substitution

$$\frac{y'}{y} = u(x)$$

Indeed, the substitution results in

$$y' = uy, \qquad y'' = u'y + uy' = u'y + u \cdot uy = (u' + u^2)\,y,$$

$$F(x, y, yu, y(u' + u^2)) = 0$$

and thus $F(x, 1, u, u' + u^2) = 0$. While writing the last expression
we have used property (59). If we manage to integrate the
first-order equation thus obtained we have

$$u = \varphi(x, C_1), \qquad \frac{y'}{y} = \varphi(x, C_1)$$

From this we deduce the general solution of the original equation:

$$\ln|y| = \int \varphi(x, C_1)\,dx + \ln C_2, \quad\text{i.e.}\quad y = C_2\,e^{\int \varphi(x, C_1)\,dx}$$

11. Connection Between Higher-Order Equations and Systems
of First-Order Equations. A higher-order equation of form (5) can
always be reduced to a system of n equations of the first order
containing n unknown functions. To do this it is sufficient to introduce
the notation

$$y = y_1, \quad y' = y_2, \quad \ldots, \quad y^{(n-1)} = y_n \tag{60}$$

Then, by equation (5), we can write

$$\left.\begin{aligned}
y_1' &= y_2 \\
y_2' &= y_3 \\
&\;\;\vdots \\
y_{n-1}' &= y_n \\
F(x, y_1, y_2, \ldots, y_n, y_n') &= 0
\end{aligned}\right\} \tag{61}$$
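The reduction (60), (61) is also the usual starting point for numerical integration. The following is a minimal sketch in Python, assuming the equation has been solved for the highest derivative, $y^{(n)} = f(x, y, \ldots, y^{(n-1)})$, whereas system (61) itself allows the implicit form. The names `as_first_order_system`, `rk4_step` and `integrate`, the classical Runge-Kutta scheme and the test equation $y'' = -y$ are illustrative choices that do not come from the text.

```python
import math

def as_first_order_system(f):
    # y^(n) = f(x, y, y', ..., y^(n-1)) with y_1 = y, ..., y_n = y^(n-1), cf. (60):
    # y_1' = y_2, ..., y_{n-1}' = y_n, y_n' = f(x, y_1, ..., y_n)
    def rhs(x, ys):
        return ys[1:] + [f(x, *ys)]
    return rhs

def rk4_step(rhs, x, ys, h):
    # one classical fourth-order Runge-Kutta step for the list of unknowns ys
    k1 = rhs(x, ys)
    k2 = rhs(x + h / 2, [y + h / 2 * k for y, k in zip(ys, k1)])
    k3 = rhs(x + h / 2, [y + h / 2 * k for y, k in zip(ys, k2)])
    k4 = rhs(x + h, [y + h * k for y, k in zip(ys, k3)])
    return [y + h / 6 * (a + 2 * b + 2 * c + d)
            for y, a, b, c, d in zip(ys, k1, k2, k3, k4)]

def integrate(rhs, x0, ys0, x_end, steps):
    # march from x0 to x_end with a fixed step, starting from the initial values ys0
    h = (x_end - x0) / steps
    x, ys = x0, list(ys0)
    for _ in range(steps):
        ys = rk4_step(rhs, x, ys, h)
        x += h
    return ys

# Sample equation (an assumption for illustration): y'' = -y, y(0) = 1, y'(0) = 0,
# whose exact solution is y = cos x.
rhs = as_first_order_system(lambda x, y, dy: -y)
y_end, dy_end = integrate(rhs, 0.0, [1.0, 0.0], 1.0, 100)
print(y_end, math.cos(1.0))
```

The two initial values play the role of the two arbitrary constants of the second-order equation.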


System (61) is of a particular form. The general form of a first-order
system (for the case of three equations containing three
unknown functions) is written as

$$\left.\begin{aligned}
F_1(x, y_1, y_2, y_3, y_1', y_2', y_3') &= 0 \\
F_2(x, y_1, y_2, y_3, y_1', y_2', y_3') &= 0 \\
F_3(x, y_1, y_2, y_3, y_1', y_2', y_3') &= 0
\end{aligned}\right\} \tag{62}$$

The general system of n equations in n unknown functions has a
similar form.

Conversely, a system of form (62) can be transformed to one equation
(in this case to a third-order equation and in the general case
to an nth-order equation) in one unknown function (for example,
the function $y_1$). Therefore the general solution of system (62)
contains three arbitrary constants:

$$y_1 = \varphi_1(x, C_1, C_2, C_3), \quad y_2 = \varphi_2(x, C_1, C_2, C_3), \quad y_3 = \varphi_3(x, C_1, C_2, C_3)$$

In order to deduce an equation of form (5) (with n = 3) from
system (62) we should differentiate each of the equations twice.
Then together with equalities (62) we obtain nine relations from
which we can eliminate the eight quantities $y_2$, $y_3$, $y_2'$, $y_3'$, $y_2''$, $y_3''$, $y_2'''$
and $y_3'''$. This yields the sought-for third-order equation. After the
equation has been integrated we obtain the general solution
$y_1 = \varphi_1(x, C_1, C_2, C_3)$. To complete the process of solving the system
we must find $y_2$ and $y_3$. But this can be performed without integration
since $y_2$ and $y_3$ are expressed in terms of $y_1$ and its derivatives
by means of the above-mentioned relations.

A general system of differential equations of arbitrary orders 
can be transformed into a system of first-order equations by intro- 
ducing notation similar to (60). For example, suppose we have 
a system of two equations. Let the equations be of the third order 
with respect to one of the unknown functions and of the second 
order with respect to the other. Then such a system is equivalent 
to a system of five first-order equations containing five unknown 
functions. The general solution of the original equation in this 
example must contain five arbitrary constants. 

12. Geometric Interpretation of System of First-Order Equations.
For simplicity’s sake, let us take the case of a system of two
first-order equations in two unknown functions $y_1(x)$ and $y_2(x)$:

$$\left.\begin{aligned}
F_1(x, y_1, y_2, y_1', y_2') &= 0 \\
F_2(x, y_1, y_2, y_1', y_2') &= 0
\end{aligned}\right\} \tag{63}$$

If it is possible to solve the system for $y_1'$ and $y_2'$ before performing
integration we obtain a system of the form

$$\left.\begin{aligned}
y_1' &= f_1(x, y_1, y_2) \\
y_2' &= f_2(x, y_1, y_2)
\end{aligned}\right\} \tag{64}$$

Then we say that the system is written in the normal form.

A solution of (63), or of (64), is, of course, a pair of functions

$$y_1 = y_1(x) \quad\text{and}\quad y_2 = y_2(x) \tag{65}$$

which converts both equations into identities. According to Sec. 11,
the general solution of the system contains two arbitrary constants,
that is it has the form

$$y_1 = y_1(x, C_1, C_2), \qquad y_2 = y_2(x, C_1, C_2)$$
System of equations (64) and its solution (65) can be simply
interpreted geometrically if we introduce a three-dimensional
Cartesian space with the coordinates $x$, $y_1$ and $y_2$ and regard it as
a usual geometric space. Then formulas (65) yield the parametric
representation of a curve (see Sec. VII.23) if we consider x to
be a parameter (we can formally write an additional equation of
the form x = x). This curve is called an integral curve of system
of equations (64). Let us take an arbitrary point $M(x, y_1, y_2)$ in
the space (see Fig. 293) and calculate the values of the right-hand
sides of system (64) at the point. These values being equal
to $y_1'$ and $y_2'$, we thus determine the directions of the tangent
lines to the curves $y_1 = y_1(x)$ and $y_2 = y_2(x)$, i.e. to the projections
of the integral curve. Therefore we can determine the direction of
the tangent to the integral curve at the point M provided it passes
through M. Point M being an arbitrary point, we see that system (64)
defines a direction field in the $x, y_1, y_2$-space. Hence, an integral curve
is a curve whose tangent goes along the direction of the field at
each point (compare with Sec. 3).

The variables $y_1$ and $y_2$ are involved equivalently in system (64)
whereas the variable x plays a specific role. But there are such cases
when all the three variables are equivalent so that each of them
can be taken as an independent variable. It is preferable to write
a system of this kind in the so-called symmetric form

$$\frac{dx}{P(x, y, z)} = \frac{dy}{Q(x, y, z)} = \frac{dz}{R(x, y, z)} \tag{66}$$

It is easy to obtain form (64) from form (66) and vice versa (how
can we do this?).
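One possible answer to this question, written out as a sketch. It assumes $P \neq 0$ in the region considered and identifies the unknown functions $y_1, y_2$ of (64) with $y, z$.

```latex
\[
\frac{dx}{P} = \frac{dy}{Q} = \frac{dz}{R}
\quad\Longrightarrow\quad
\frac{dy}{dx} = \frac{Q(x, y, z)}{P(x, y, z)}, \qquad
\frac{dz}{dx} = \frac{R(x, y, z)}{P(x, y, z)} ;
\]
\[
y_1' = f_1(x, y_1, y_2),\quad y_2' = f_2(x, y_1, y_2)
\quad\Longrightarrow\quad
\frac{dx}{1} = \frac{dy_1}{f_1(x, y_1, y_2)} = \frac{dy_2}{f_2(x, y_1, y_2)} .
\]
```

Where $P = 0$ but $Q \neq 0$ or $R \neq 0$, the same division can be carried out with y or z taken as the independent variable instead.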






The geometric interpretation of system (66) is analogous to the
above interpretation of system (64). By relations (66), the vector
$d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$ must be parallel to the known vector
$P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$ at each given point $M(x, y, z)$ (why?). The problem
of integrating system (66) is therefore reduced to the problem of
constructing curves in space which have prescribed directions of
their tangents at each point.

The geometric meaning of system (64) implies that in order to
isolate a unique integral curve we must specify a point $M_0(x_0, y_0, z_0)$
in space through which the curve should pass. In other words, the
initial condition

$$y_1(x_0) = y_{1,0}, \qquad y_2(x_0) = y_{2,0}$$

uniquely defines a certain solution of system (64). As in Sec. 7,
in the case of a system of equations we can also encounter singular
points and singular curves. These points and curves can be
distinguished in a manner similar to the one described in Sec. 7. In
particular, a singular point of system (66) is a point at which all the
three denominators vanish. Every point of this kind is a singular
point; the vector $P\mathbf{i} + Q\mathbf{j} + R\mathbf{k}$ turns into zero at such a point
and thus it has no definite direction at the point.

A normal system of first-order equations with an arbitrary number
of unknown functions has the general form

$$\left.\begin{aligned}
y_1' &= f_1(x, y_1, y_2, \ldots, y_n) \\
y_2' &= f_2(x, y_1, y_2, \ldots, y_n) \\
&\;\;\vdots \\
y_n' &= f_n(x, y_1, y_2, \ldots, y_n)
\end{aligned}\right\} \tag{67}$$

A solution of the system is a set of functions of the form

$$y_1 = y_1(x), \quad y_2 = y_2(x), \quad \ldots, \quad y_n = y_n(x) \tag{68}$$

which reduces the system to an identity. The general solution of
such a system contains n arbitrary constants.

To specify a unique solution we can set initial conditions of the
form

$$y_1(x_0) = y_{1,0}, \quad y_2(x_0) = y_{2,0}, \quad \ldots, \quad y_n(x_0) = y_{n,0} \tag{69}$$

Cauchy proved that there is a solution of system (67) satisfying
conditions (69) and that the solution is unique if the right-hand
sides of system (67) are continuous and possess finite partial
derivatives of the first order with respect to the variables $y_1, y_2, \ldots, y_n$
for the values $x = x_0$, $y_1 = y_{1,0}$, ..., $y_n = y_{n,0}$.

The geometric meaning of system (67), its solution (68) and
conditions (69) is that the system defines a direction field in an
(n + 1)-dimensional space with the coordinates $x, y_1, \ldots, y_n$ and that
the solution of the system satisfying conditions (69) defines an
integral curve lying in the space and passing through the point
$M_0(x_0, y_{1,0}, \ldots, y_{n,0})$ determined by (69) (see Sec. VII.18).

In case the right-hand sides of system (67) do not depend on the
variable x the system is called autonomous. It turns out that it
is more convenient to regard its solutions as curves lying in the
n-dimensional space of the variables $y_1, y_2, \ldots, y_n$ which is a
subspace of the $x, y_1, \ldots, y_n$-space, the variable x playing the role
of a parameter. In such a case the space of variables $y_1, \ldots, y_n$
is called the phase space of the system and the curve
$y_1(x), \ldots, y_n(x)$ is called a phase trajectory of the system. A point
$(y_1, \ldots, y_n)$ of the n-dimensional space is called a state of the
system and thus the space $y_1, \ldots, y_n$ may be called a state space.
For simplicity’s sake, let us limit our attention to the case n = 2.
We shall denote the independent variable by t and interpret it as
time. The unknown functions $y_1$ and $y_2$ will be designated by the
letters x and y, i.e. $x = x(t)$ and $y = y(t)$. Then, in place of system
(67), we get a system of the form

$$\frac{dx}{dt} = P(x, y), \qquad \frac{dy}{dt} = Q(x, y)$$


Multiplying the first equation by $\mathbf{i}$ and the second equation by $\mathbf{j}$
and adding together (in the vectorial sense) the results we arrive
at the vector differential equation

$$\frac{d\mathbf{r}}{dt} = \mathbf{A}(x, y), \quad\text{or}\quad \frac{d\mathbf{r}}{dt} = \mathbf{A}(\mathbf{r}) \tag{70}$$

where $\mathbf{A} = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ is a given vector field in
the phase plane x, y. The derivative $\frac{d\mathbf{r}}{dt}$ being the velocity (see
Sec. VII.23), we see that there is a velocity field defined in the
x, y-plane. Each solution $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ defines the law
of motion of a point in the plane, and the point has a prescribed
velocity at each of its positions. We can imagine that equation (70)
defines a flow of liquid in the phase plane and that to solutions of
the equation there correspond laws of motion of particles of the
liquid (but in fact the equations of motion of a liquid medium are
much more complicated; such equations are deduced in hydrodynamics
and they prove to be partial differential equations). The
autonomy of equation (70) implies that the “flow” is stationary and
therefore any two distinct trajectories have no common points.

For example, let us write equation (4) in the form of an autonomous
system of first-order equations:

$$\frac{dy}{dt} = v, \qquad M\,\frac{dv}{dt} = -ky \tag{71}$$

where y is the coordinate of the oscillating point and v is its velocity.
In courses on physics one can find the following formula of the
total energy of an oscillating point:

$$E = \frac{Mv^2}{2} + \frac{ky^2}{2} \tag{72}$$

(let the reader think about the structure of the formula). In the
process of free oscillations without friction the energy must be
preserved. Indeed, computing the derivative of E with respect to t
on the basis of formula (71) we obtain

$$\frac{dE}{dt} = Mv\,\frac{dv}{dt} + ky\,\frac{dy}{dt} = -kyv + kyv = 0$$

This is a mathematical proof of the law of conservation of energy
for our example. Hence, we have E = const for every solution of
system (71). Therefore we see that a point representing the state of
the system in the phase plane describes an ellipse in the y, v-plane.
Different ellipses correspond to different possible oscillations of the
material point, and to different ellipses there correspond oscillations
about the equilibrium position with different amplitudes
(see Fig. 294).
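The conservation of energy and the elliptic phase trajectory can be checked on the exact harmonic solution. The sketch below assumes the form $dy/dt = v$, $M\,dv/dt = -ky$ for system (71) and the expression $E = Mv^2/2 + ky^2/2$ for (72); the numerical values of M, k, the amplitude and the phase are arbitrary sample values, and the function names are hypothetical.

```python
import math

M, k = 2.0, 8.0            # sample mass and stiffness (assumed values)
omega = math.sqrt(k / M)   # angular frequency of the free oscillation

def state(t, A=1.5, phase=0.3):
    # exact solution: y = A cos(omega t + phase), v = dy/dt
    y = A * math.cos(omega * t + phase)
    v = -A * omega * math.sin(omega * t + phase)
    return y, v

def energy(y, v):
    # total energy of the oscillating point
    return M * v * v / 2 + k * y * y / 2

E0 = energy(*state(0.0))
a = math.sqrt(2 * E0 / k)   # semi-axis of the ellipse along y
b = math.sqrt(2 * E0 / M)   # semi-axis of the ellipse along v

for t in (0.0, 0.4, 1.1, 2.7):
    y, v = state(t)
    # the energy stays equal to E0 and the point lies on (y/a)^2 + (v/b)^2 = 1
    print(t, energy(y, v), (y / a) ** 2 + (v / b) ** 2)
```

A larger amplitude A gives a larger energy and hence a larger ellipse with the same ratio of semi-axes.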

13. First Integrals. For definiteness, we now consider a system
of three first-order equations of form (62). Every relation of the form

$$\Phi(x, y_1, y_2, y_3, C) = 0 \quad (\Phi \not\equiv 0) \tag{73}$$

which is identically satisfied by any solution of the system is called
a first integral of the system of equations. Here C is a constant which,
in general, is different for different solutions.

Knowing a first integral we can reduce the number of equations
in the system by unity. Indeed, if we express $y_3$ in terms of the rest
by means of relation (73) and then substitute the result into the
first two equations (62) we obtain a system of two first-order
equations containing two unknown functions $y_1$ and $y_2$. If we integrate
the last system, i.e. if we find $y_1(x)$ and $y_2(x)$, then $y_3(x)$ can be
found without integration on the basis of equality (73).

Similarly, if we know two independent first integrals we can
reduce the number of equations by two. Finally, if we manage to
find three independent first integrals (that is such integrals that
none of them is a consequence of the others) of the form

$$\left.\begin{aligned}
\Phi_1(x, y_1, y_2, y_3, C_1) &= 0 \\
\Phi_2(x, y_1, y_2, y_3, C_2) &= 0 \\
\Phi_3(x, y_1, y_2, y_3, C_3) &= 0
\end{aligned}\right\}$$

then we obtain the general solution of system (62) represented in
an implicit form.

It is sometimes possible to find first integrals by deducing
so-called integrable combinations from given equations. For instance,
we can easily derive such a combination for the system

$$\left.\begin{aligned}
y' &= y + z \\
z' &= -y + z
\end{aligned}\right\} \tag{74}$$

Indeed, we see that

$$yy' + zz' = y(y + z) + z(-y + z) = y^2 + z^2$$

and therefore

$$\frac{1}{2}\,(y^2 + z^2)' = y^2 + z^2, \qquad \frac{d(y^2 + z^2)}{y^2 + z^2} = 2\,dx \quad\text{and}\quad \ln(y^2 + z^2) = 2x + \ln C$$

Finally, we obtain the first integral

$$y^2 + z^2 = Ce^{2x}$$

It follows that if $x \to \infty$ then every nonzero solution approaches
infinity, and if $x \to -\infty$ it tends to zero. As a rule, in other cases we can
also draw some important conclusions concerning the behaviour
of solutions without completing integration of a system in question
when we know one or more first integrals of the system. Returning
to system (74) we see that we can obtain another first integral if
we divide one of the equations by the other (let the reader perform
the calculations!).
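The first integral $y^2 + z^2 = Ce^{2x}$ can be checked against an explicit solution of the system $y' = y + z$, $z' = -y + z$. The sketch below assumes the solution $y = e^x(a\cos x + b\sin x)$, $z = e^x(-a\sin x + b\cos x)$, which is not written out in the text but can be verified by substitution; the constants a, b and the helper names are illustrative.

```python
import math

a, b = 0.8, -0.5  # arbitrary constants of the explicit solution (assumed values)

def y(x):
    return math.exp(x) * (a * math.cos(x) + b * math.sin(x))

def z(x):
    return math.exp(x) * (-a * math.sin(x) + b * math.cos(x))

def derivative(f, x, h=1e-5):
    # central difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

def first_integral(x):
    # (y^2 + z^2) e^{-2x}, which should equal the same constant C for every x
    return (y(x) ** 2 + z(x) ** 2) * math.exp(-2 * x)

for x in (-2.0, 0.0, 1.5):
    # residuals of y' = y + z and z' = -y + z, then the value of C
    print(derivative(y, x) - (y(x) + z(x)),
          derivative(z, x) - (-y(x) + z(x)),
          first_integral(x))
```

For this solution the constant is $C = a^2 + b^2$, so the point $(y, z)$ moves along a curve whose distance from the origin grows like $e^x$.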

There are some cases when first integrals can be derived on the
basis of certain physical considerations, most often from various
conservation laws.

For instance, relation (72) is a first integral of system (71) because 
we can regard E as an arbitrary constant C. Expressing v in terms 
of y from the relation and substituting the result into the first equ- 
ation (71) we can easily complete the integration (let the reader 
do this!). 

In conclusion, we note that, as we have seen, it is more natural
to consider systems of equations in which the number of equations
coincides with the number of unknown functions. Such systems are
called closed. If the number of unknown functions exceeds the
number of equations the system is called non-closed (or underdetermined).
The excess unknown functions can be chosen arbitrarily.
The fact that a system is non-closed usually indicates that
some necessary relations were not taken into account. If the number
of equations is greater than the number of unknown functions the
system is called overdetermined. Such a system is usually
contradictory, that is it has no solutions. An overdetermined system can
also indicate that there is an interdependence between the equations
entering into the system in question. This means that some of the
equations are consequences of the rest and therefore they are
unnecessary and can be dropped. Such a situation may also show that
when deducing the equations we made a mistake.

§ 4. Linear Equations of General Form 

14. Homogeneous Linear Equations. The methods of investigating
linear equations of arbitrary orders have many features similar
to those of the methods of solving first-order linear equations (see
Sec. 4). But in the general case it is no longer possible to integrate
an equation by quadratures. For the sake of simplicity, we first
consider second-order equations. The equation

$$z'' + p(x)\,z' + q(x)\,z = 0 \tag{75}$$

whose left-hand side is a linear function in the unknown function
and its derivatives is called a homogeneous linear equation.

For brevity, let us denote the left-hand side of equation (75) as
$L[z]$, i.e. in this case, by definition,

$$L[z] \equiv z'' + p(x)\,z' + q(x)\,z$$

Then equation (75) can be rewritten in the form

$$L[z] = 0$$

The expression $L[z]$ possesses the following properties:

$$\begin{aligned}
L[z_1 + z_2] &= (z_1 + z_2)'' + p(x)\,(z_1 + z_2)' + q(x)\,(z_1 + z_2) \\
&= (z_1'' + p(x)\,z_1' + q(x)\,z_1) + (z_2'' + p(x)\,z_2' + q(x)\,z_2) \\
&= L[z_1] + L[z_2], \qquad L[Cz] = C\,L[z]
\end{aligned}$$

where C = const (the last property is verified similarly).

Expressions of this type are called linear operators; we mentioned 
them in Sec. XIV.26. 

We can easily prove the following properties of equation (75): 

1. A sum of solutions of equation (75) is a solution of the same
equation. Actually, if $z_1$ and $z_2$ are two solutions of equation (75)
then

$$L[z_1] \equiv 0 \quad\text{and}\quad L[z_2] \equiv 0$$

and thus

$$L[z_1 + z_2] = L[z_1] + L[z_2] \equiv 0$$

2. If we multiply a solution of equation (75) by a constant we
get a solution of the same equation. The verification of the second
property is analogous to that of the first property.

Now we can combine properties 1 and 2 in the following way:
a linear combination of solutions of equation (75) (see Sec. VII.5)
is a solution of the same equation. For instance, if $z_1(x)$ and $z_2(x)$
satisfy equation (75) then

$$z = C_1 z_1(x) + C_2 z_2(x) \tag{76}$$

also satisfies the same equation for any values of the constants
$C_1$ and $C_2$.

3. A function which is identically equal to zero always satisfies 
equation (75). 

4. If we know a nonzero solution of equation (75) we can reduce
by unity the order of the equation. Indeed, let $z_1(x)$ be such
a solution. Make a substitution of the form $z = z_1 u$ where $u = u(x)$
is a new unknown function. Then we get

$$(z_1'' u + 2z_1' u' + z_1 u'') + p\,(z_1' u + z_1 u') + q z_1 u = 0$$

that is

$$z_1 u'' + (2z_1' + p z_1)\,u' + (z_1'' + p z_1' + q z_1)\,u = 0$$

But since $L[z_1] \equiv 0$, the last term vanishes and hence, after the
substitution $u' = v$, we finally receive

$$z_1 v' + (2z_1' + p z_1)\,v = 0$$

which is a first-order linear equation, that is an equation whose
order is less by unity than the order of the original equation.

Now we complete the integration:

$$\frac{dv}{v} = -\frac{2z_1' + p z_1}{z_1}\,dx, \qquad \ln|v| = -2\ln|z_1| - \int p(x)\,dx + \ln C_2,$$

$$v = \frac{C_2}{z_1^2}\,e^{-\int p(x)\,dx}, \qquad u = C_2 \int \frac{1}{z_1^2}\,e^{-\int p(x)\,dx}\,dx + C_1$$

and hence

$$z = C_1 z_1 + C_2 z_1 \int \frac{1}{z_1^2}\,e^{-\int p(x)\,dx}\,dx \tag{77}$$

The function in front of which the factor $C_2$ is placed is one of the
particular solutions of equation (75) because we can obtain it from
general solution (77) by putting $C_1 = 0$ and $C_2 = 1$. Therefore,
denoting this solution by $z_2$ we arrive at the fifth property.

5. The general solution of equation (75) has form (76) where $C_1$
and $C_2$ are arbitrary constants and $z_1$ and $z_2$ are two particular
solutions of the equation.
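As a worked illustration of formula (77), consider the equation $z'' - 2z' + z = 0$ with the known solution $z_1 = e^x$. This equation is chosen here as an example and does not come from the text.

```latex
\[
p(x) = -2, \qquad e^{-\int p(x)\,dx} = e^{2x}, \qquad \frac{1}{z_1^{2}} = e^{-2x},
\]
\[
z_2 = z_1 \int \frac{1}{z_1^{2}}\, e^{-\int p(x)\,dx}\,dx = e^{x} \int dx = x e^{x},
\qquad
z = C_1 e^{x} + C_2\, x e^{x} .
\]
```

A direct check confirms the result: for $z_2 = xe^x$ we have $z_2' = (1 + x)e^x$ and $z_2'' = (2 + x)e^x$, so that $z_2'' - 2z_2' + z_2 = 0$. The constants of integration omitted in the two integrals only change $C_1$ and $C_2$.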

It should be noted that formula (76) in fact expresses the general
solution of equation (75) if and only if the solutions $z_1$ and $z_2$ are
linearly independent. The concept of linear dependence of functions
is similar to that of vectors (see Sec. VII.5). Namely, we say that
several given functions are linearly dependent if one of them is a
linear combination of the rest. In particular, two functions $z_1(x)$
and $z_2(x)$ are linearly dependent if and only if $z_2(x) \equiv Cz_1(x)$
or $z_1(x) \equiv Cz_2(x)$. Thus, $z_1(x)$ and $z_2(x)$ are proportional to each
other in this case. Hence, we see that formula (76) does not yield
the general solution in such a case because

$$C_1 z_1 + C_2 z_2 = C_1 z_1 + C_2 C z_1 = (C_1 + C_2 C)\,z_1(x) = Dz_1(x)$$

where $D = C_1 + C_2 C$ is a constant. This means that although
formally we have two arbitrary constants on the right-hand side of
expression (76), these constants are not essential parameters since
their number can be reduced by unity (see Sec. X.2).

All the enumerated properties also hold for a homogeneous linear
equation of any order of the form

$$z^{(n)} + p(x)\,z^{(n-1)} + q(x)\,z^{(n-2)} + \ldots + s(x)\,z = 0 \tag{78}$$

with the only exception of the fifth property which must be replaced
by the following assertion: the general solution of equation (78)
has the form

$$z = C_1 z_1(x) + C_2 z_2(x) + \ldots + C_n z_n(x) \tag{79}$$

instead of (76). The functions $z_1(x), z_2(x), \ldots, z_n(x)$ entering
into formula (79) are any n linearly independent solutions of equation
(78) and $C_1, C_2, \ldots, C_n$ are arbitrary constants. Any family of n
linearly independent solutions of equation (78) is called its
fundamental system of solutions. Thus, the general solution of equation
(78) is a linear combination of solutions forming a fundamental system
of solutions with n arbitrary constant coefficients. Using the
terminology of Secs. VII.17-19 we can say that the totality of all the
solutions of equation (78) is an n-dimensional linear space. A
fundamental system of solutions is a basis in such a space.

In conclusion we note that the left-hand side of equation (78)
is a homogeneous function in the variables $z, z', \ldots, z^{(n-1)}, z^{(n)}$
and therefore we can reduce by unity the order of the equation with
the help of the method of Sec. 10 (see type 4). But this procedure
is rarely performed because after the order is reduced the equation
becomes non-linear.

15. Non-Homogeneous Equations. We now consider a non-homogeneous
linear equation of the form

$$y'' + p(x)\,y' + q(x)\,y = f(x) \tag{80}$$

According to Sec. 14, let us denote the left-hand side by $L[y]$.

1. A particular solution of equation (80) being known, it is possible
to reduce the problem of integrating the equation to the problem
of integrating a homogeneous equation (75) which corresponds
to (80) [that is an equation of form (75) having the same coefficients
on the left-hand side as equation (80) but with zero right-hand side,
or, simply, an equation that is obtained from (80) by dropping its
right-hand side].

Indeed, if $Y(x)$ is such a solution then after the substitution

$$y = Y(x) + z \tag{81}$$

where $z = z(x)$ is a new unknown function we obtain

$$L[Y + z] = f(x)$$

which suggests

$$L[Y] + L[z] = f(x)$$

But $L[Y] \equiv f(x)$ (why is it so?). Hence we arrive at an equation
of form (75) containing z as an unknown function.

Thus, the general solution of non-homogeneous linear equation (80)
is the sum of any particular solution of the equation and the general
solution of the corresponding homogeneous linear equation (compare
this result with the general solution of the first-order equation
considered in Sec. 4).

2. If the right-hand side $f(x)$ is a linear combination of some
functions, for instance, of two functions, that is
$f(x) = \alpha f_1(x) + \beta f_2(x)$ (where $\alpha$ and $\beta$ are constants), and if certain particular
solutions $Y_1(x)$ and $Y_2(x)$ of equation (80) having right-hand sides
equal to $f_1(x)$ and $f_2(x)$, respectively, are known, then the function

$$Y(x) = \alpha Y_1(x) + \beta Y_2(x)$$

is a particular solution of equation (80) with the right-hand side
$f(x)$.

This simple property is in fact a special case of a general principle
called the superposition principle (see Sec. XIV.26). The proof of
the property is left to the reader.
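The superposition property can be tried on a simple case. The sketch below assumes the sample operator $L[y] = y'' + y$ (that is, $p \equiv 0$, $q \equiv 1$), the right-hand sides $f_1 = e^x$ and $f_2 = x$ with particular solutions $Y_1 = e^x/2$ and $Y_2 = x$, and the constants $\alpha = 3$, $\beta = -2$. None of these come from the text, and the names are hypothetical.

```python
import math

def p(x):
    return 0.0   # sample coefficient p(x) (assumed)

def q(x):
    return 1.0   # sample coefficient q(x) (assumed)

def L(Y, x):
    # L[y] = y'' + p(x) y' + q(x) y; Y is a triple of callables (y, y', y'')
    y, dy, d2y = Y
    return d2y(x) + p(x) * dy(x) + q(x) * y(x)

def half_exp(x):
    return math.exp(x) / 2

Y1 = (half_exp, half_exp, half_exp)                 # L[Y1] = e^x
Y2 = (lambda x: x, lambda x: 1.0, lambda x: 0.0)    # L[Y2] = x

alpha, beta = 3.0, -2.0   # sample constants

# Y = alpha*Y1 + beta*Y2 together with its first two derivatives
Y = tuple((lambda x, u=u, w=w: alpha * u(x) + beta * w(x)) for u, w in zip(Y1, Y2))

def f(x):
    # right-hand side alpha*f1 + beta*f2 with f1 = e^x and f2 = x
    return alpha * math.exp(x) + beta * x

for x in (-1.0, 0.0, 2.5):
    # the difference L[Y] - f should vanish up to rounding
    print(x, L(Y, x) - f(x))
```

The check works because L is linear, so that $L[\alpha Y_1 + \beta Y_2] = \alpha L[Y_1] + \beta L[Y_2] = \alpha f_1 + \beta f_2$.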

3. If the general solution of homogeneous equation (75) is known 
the general solution of equation (80) can be found by quadratures. 

This is performed by means of a method discovered by Lagrange.
It is called the method of variation of parameters (we used the
method in Sec. 4 in solving the first-order non-homogeneous
linear equation). As we know, the general solution of equation (75)
is of form (76). By analogy with formula (22), we seek a solution
of equation (80) in the form

$$y = \varphi_1(x)\,z_1(x) + \varphi_2(x)\,z_2(x) \tag{82}$$

where $\varphi_1$ and $\varphi_2$ are some functions yet unknown. But there
are two such functions here and only one equation (80). Therefore, in
order to find the functions, we shall impose one more additional
condition which will be put down below [condition (84)].
Differentiating equality (82) we obtain

$$y' = (\varphi_1 z_1' + \varphi_2 z_2') + (\varphi_1' z_1 + \varphi_2' z_2) \tag{83}$$

We now set the condition that the expression inside the second
parentheses on the right-hand side of relation (83) should identically
vanish:

$$\varphi_1' z_1 + \varphi_2' z_2 = 0 \tag{84}$$

Then when differentiating equality (83), we must take into account
only the expression entering into the first parentheses, that is

$$y'' = (\varphi_1 z_1'' + \varphi_2 z_2'') + (\varphi_1' z_1' + \varphi_2' z_2') \tag{85}$$

Substituting the results (82), (83) and (85) thus obtained into
equation (80) and dropping the sum which equals zero we obtain

$$\varphi_1 (z_1'' + p z_1' + q z_1) + \varphi_2 (z_2'' + p z_2' + q z_2) + (\varphi_1' z_1' + \varphi_2' z_2') = f(x)$$

(check the calculations!).

Since the functions $z_1$ and $z_2$ satisfy equation (75), the first two
parentheses in the last equation vanish and can be deleted. Thus we
arrive at the equality

$$\varphi_1' z_1' + \varphi_2' z_2' = f(x) \tag{86}$$

Hence, now we have two relations (84) and (86) for determining
$\varphi_1$ and $\varphi_2$. The functions $z_1$, $z_2$ and $f(x)$ being
regarded as known, we have a system of two algebraic equations of the
first degree in two unknown quantities, i.e. in the derivatives
$\varphi_1'(x)$ and $\varphi_2'(x)$. Solving the system we find the
unknowns and then, integrating, we obtain $\varphi_1$ and $\varphi_2$.
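As an illustration (it is not part of the derivation above), the system
(84), (86) can be solved explicitly by Cramer's rule. The symbol $W$ for
the determinant of the system is introduced here only for convenience;
it is the Wronskian of $z_1$ and $z_2$, which does not vanish for
linearly independent solutions of equation (75).

```latex
W = z_1 z_2' - z_2 z_1', \qquad
\varphi_1' = -\frac{z_2\, f}{W}, \qquad
\varphi_2' = \frac{z_1\, f}{W}
% check of (84): \varphi_1' z_1 + \varphi_2' z_2 = (-z_2 z_1 + z_1 z_2) f / W = 0
% check of (86): \varphi_1' z_1' + \varphi_2' z_2' = (z_1 z_2' - z_2 z_1') f / W = f
```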

For instance, let us take the simplest equation of forced oscillations
which is obtained if we add an external force $P(t)$ to the right-hand
side of equation (3). Dividing both sides of the equation by $M$
we derive

$$y'' + \omega_0^2\,y = f(t) \tag{87}$$

where $\omega_0^2$ stands for the elasticity coefficient entering into
equation (3) divided by $M$, and the notation $f(t) = P(t)/M$ is
introduced.

The corresponding homogeneous equation

$$z'' + \omega_0^2\,z = 0 \tag{88}$$

has two linearly independent particular solutions of the form
$z_1 = \cos\omega_0 t$ and $z_2 = \sin\omega_0 t$ (this fact can be
directly verified by substituting the solutions into the equation).
The general solution can therefore be written as

$$z = C_1\cos\omega_0 t + C_2\sin\omega_0 t \tag{89}$$

In particular, it follows that $\omega_0$ is the fundamental frequency
(natural frequency) of the system in question, that is the frequency
of free oscillations arising in the system when there are no external
forces.

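According to formula (82), we seek a solution of equation (87)
in the form

$$y = \varphi_1(t)\cos\omega_0 t + \varphi_2(t)\sin\omega_0 t \tag{90}$$

Then equations (84) and (86) turn into

$$\left.\begin{aligned}
\varphi_1'\cos\omega_0 t + \varphi_2'\sin\omega_0 t &= 0 \\
\varphi_1'\,(-\omega_0\sin\omega_0 t) + \varphi_2'\,\omega_0\cos\omega_0 t &= f(t)
\end{aligned}\right\}$$

From this we immediately find

$$\varphi_1'(t) = -\frac{1}{\omega_0}\,f(t)\sin\omega_0 t \quad\text{and}\quad \varphi_2'(t) = \frac{1}{\omega_0}\,f(t)\cos\omega_0 t$$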

Now we must integrate the last relations. It is inconvenient to
use indefinite integrals here because they contain arbitrary constants
which are difficult to specify. It is therefore better to put down
the results in the form of definite integrals with a fixed lower limit
of integration and with a variable upper limit. Let the lower limit
be equal to zero (we assume that $t = 0$ is the initial instant of time).
Then we have

$$\varphi_1(t) = -\frac{1}{\omega_0}\int_0^t f(t)\sin\omega_0 t\,dt + C_1$$

where $C_1$ is an arbitrary constant. Here the variable $t$ is understood
in a two-fold sense because the letter $t$ designates the variable of
integration and the upper limit as well. It is therefore more convenient
to use the fact that the value of a definite integral is independent
of the notation of the variable of integration (see Sec. XIV.3)
and rewrite the expression of $\varphi_1(t)$ in the form

$$\varphi_1(t) = -\frac{1}{\omega_0}\int_0^t f(\tau)\sin\omega_0\tau\,d\tau + C_1$$

The expression of $\varphi_2(t)$ is found similarly:

$$\varphi_2(t) = \frac{1}{\omega_0}\int_0^t f(\tau)\cos\omega_0\tau\,d\tau + C_2$$

Substituting $\varphi_1$ and $\varphi_2$ thus found into (90) we derive

$$y = -\frac{1}{\omega_0}\cos\omega_0 t\int_0^t f(\tau)\sin\omega_0\tau\,d\tau
+ \frac{1}{\omega_0}\sin\omega_0 t\int_0^t f(\tau)\cos\omega_0\tau\,d\tau
+ C_1\cos\omega_0 t + C_2\sin\omega_0 t$$

We now insert $\cos\omega_0 t$ and $\sin\omega_0 t$ under the integral
sign. We could not do this if we had not changed the notation of the
variable of integration. But now this is permissible and thus we get
the expression

$$y = \frac{1}{\omega_0}\int_0^t \left[-f(\tau)\cos\omega_0 t\,\sin\omega_0\tau
+ f(\tau)\sin\omega_0 t\,\cos\omega_0\tau\right]d\tau
+ C_1\cos\omega_0 t + C_2\sin\omega_0 t$$

in which both integrals are combined. Since
$\sin\omega_0 t\,\cos\omega_0\tau - \cos\omega_0 t\,\sin\omega_0\tau = \sin\omega_0(t - \tau)$,
we obtain the general solution of equation (87):

$$y = \frac{1}{\omega_0}\int_0^t \sin\omega_0(t - \tau)\,f(\tau)\,d\tau
+ C_1\cos\omega_0 t + C_2\sin\omega_0 t \tag{91}$$

The arbitrary constants $C_1$ and $C_2$ can be determined if we set,
for example, the initial conditions

$$y(0) = y_0, \qquad y'(0) = v_0 \tag{92}$$

Substituting $t = 0$ into both sides of (91) we obtain $y_0 = C_1$. To
use the second condition (92) we must differentiate equality (91)
with respect to $t$ and then substitute $t = 0$. When differentiating
the integral we must take into account that $t$ enters into the integral
as an upper limit of integration and as a parameter under the
integral sign. Hence, using formula (XIV.80), we deduce

$$y' = \frac{1}{\omega_0}\int_0^t \omega_0\cos\omega_0(t - \tau)\,f(\tau)\,d\tau
+ \frac{1}{\omega_0}\sin\omega_0(t - t)\,f(t)
- C_1\omega_0\sin\omega_0 t + C_2\omega_0\cos\omega_0 t$$

$$\phantom{y'} = \int_0^t \cos\omega_0(t - \tau)\,f(\tau)\,d\tau
- C_1\omega_0\sin\omega_0 t + C_2\omega_0\cos\omega_0 t$$

Putting $t = 0$ we thus obtain $v_0 = C_2\omega_0$, that is
$C_2 = v_0/\omega_0$. Consequently, the particular solution of
equation (87) satisfying the initial conditions (92) is

$$y = \frac{1}{\omega_0}\int_0^t \sin\omega_0(t - \tau)\,f(\tau)\,d\tau
+ y_0\cos\omega_0 t + \frac{v_0}{\omega_0}\sin\omega_0 t$$
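The last formula lends itself to a quick numerical check. The sketch
below is only an illustration under stated assumptions: the function
names, the use of Simpson's rule and the test force $f = 1$ are our
choices and do not come from the text. For $f = 1$ the equation
$y'' + \omega_0^2 y = 1$ can be solved in closed form, which gives
something to compare with.

```python
import math

def forced_oscillation(f, omega0, y0, v0, t, steps=2000):
    """Evaluate y(t) = (1/omega0) * int_0^t sin(omega0*(t - tau)) * f(tau) dtau
    + y0*cos(omega0*t) + (v0/omega0)*sin(omega0*t).
    The integral is approximated by the composite Simpson rule
    (an assumed quadrature; any other rule would do)."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even number of steps
    h = t / steps

    def g(tau):
        return math.sin(omega0 * (t - tau)) * f(tau)

    total = g(0.0) + g(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    integral = total * h / 3.0
    return (integral / omega0
            + y0 * math.cos(omega0 * t)
            + (v0 / omega0) * math.sin(omega0 * t))

def exact_constant_force(omega0, y0, v0, t):
    """Closed-form solution of y'' + omega0^2 y = 1, y(0) = y0, y'(0) = v0."""
    return ((1.0 - math.cos(omega0 * t)) / omega0 ** 2
            + y0 * math.cos(omega0 * t)
            + (v0 / omega0) * math.sin(omega0 * t))
```

Calling `forced_oscillation(lambda tau: 1.0, 2.0, 0.5, -1.0, 3.0)` and
`exact_constant_force(2.0, 0.5, -1.0, 3.0)` should give values that agree
up to the quadrature error.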

All the enumerated properties hold for the general equation

$$y^{(n)} + p(x)\,y^{(n-1)} + q(x)\,y^{(n-2)} + \dots + s(x)\,y = f(x) \tag{93}$$

as well. The method of variation of arbitrary constants is applied
to equation (93) in the following way: we substitute functions
$\varphi_1(x), \varphi_2(x), \dots, \varphi_n(x)$ for $C_1, C_2, \dots, C_n$,
respectively, into formula (79) and then differentiate the formula in
succession up to the $(n-1)$th derivative inclusive. After each of the
differentiations we obtain a group of terms containing the derivatives
$\varphi_1', \varphi_2', \dots, \varphi_n'$, and we equate each of the
groups to zero. Finally, differentiating formula (79) [with
$C_1, C_2, \dots, C_n$ replaced by $\varphi_1(x), \varphi_2(x), \dots, \varphi_n(x)$]
the $n$th time and substituting all the expressions thus obtained into
equation (93) we derive the $n$th relation connecting
$\varphi_1', \dots, \varphi_n'$. Then $\varphi_1', \dots, \varphi_n'$ are
found by solving the linear algebraic system of equations, etc.

4. Any solution of equation (78) or of equation (93) can be extended
to any interval in which the coefficients and the right-hand
side of the equation do not approach infinity. Generally, this is
not the case for non-linear equations because it can happen that
a solution (or its derivatives) approaches infinity for some finite value
of $x$. A simple example of this fact is equation (38) in the case a > 1
(see Fig. 288). Here the slope of the direction field increases so fast,
as $y$ increases, that integral curves travel into infinity after passing
only a finite distance along the $x$-axis. On the contrary, a solution
of a linear equation (for instance, the solution $y = Ce^{Mx}$ of the
equation $y' = My$) cannot approach infinity for a finite value of $x$.

16. Boundary- Value Problems. In the preceding section we studied 
problems in which we isolated a particular solution from the general 
solution by means of initial conditions which define certain values 
of an unknown function and of its derivatives for a single value 
of the argument. But there are some other ways of isolating a par- 
ticular solution from the general solution which are encountered 
in practical problems. At the same time it is common for all methods 
of determining a particular solution that the number of additional 
conditions imposed on a sought-for solution must be equal to the
number of degrees of freedom (see Sec. X.2) in the general solution 
of an equation in question, that is to the order of the equation. 

These additional conditions can be written, in the case of equation (5)
of the $n$th order, in the form

$$G_k[y] = \alpha_k \qquad (k = 1, 2, \dots, n) \tag{94}$$

where $G_k[y]$ ($k = 1, 2, \dots, n$) is a given combination of the
values of the sought-for function $y(x)$ and its derivatives taken,
in the general case, for different values of the argument, and
$\alpha_1, \alpha_2, \dots, \alpha_n$ are given numbers. [More precisely,
$G_k[y]$ ($k = 1, 2, \dots, n$) is a given functional, the notion of
a functional being mentioned in Sec. XIV.4.] For instance, if we take
the case of initial conditions (11) then $G_k[y]$ is simply equal to
$y^{(k-1)}(x_0)$.

If general solution (8) of a given equation is known then to find
the particular solution we are interested in we must substitute the
expression of the general solution into conditions (94) which results
in a system of $n$ equations in $n$ unknowns $C_1, C_2, \dots, C_n$. If

$$G_k[C_1 y_1 + C_2 y_2] = C_1 G_k[y_1] + C_2 G_k[y_2] \qquad (C_1, C_2 = \text{const})$$

then conditions (94) are called linear. If, in addition, all
$\alpha_k = 0$ then the conditions are called homogeneous linear
conditions. If we have functions (which may not necessarily be
solutions of the differential equation) satisfying homogeneous linear
conditions then any linear combination of the functions satisfies the
conditions as well. In fact, if, for example, $G_k[y_1] = 0$ and
$G_k[y_2] = 0$ then

$$G_k[C_1 y_1 + C_2 y_2] = C_1 G_k[y_1] + C_2 G_k[y_2] = C_1 \cdot 0 + C_2 \cdot 0 = 0$$

The difference of two functions satisfying the same non-homogeneous
linear conditions satisfies the corresponding homogeneous condition
(that is a homogeneous linear condition with the same left-hand
side). Let the reader verify the assertion!

In our further discussion we shall limit ourselves to solutions
of the equation

$$y'' + p(x)\,y' + q(x)\,y = f(x) \qquad (a \le x \le b) \tag{95}$$

with the additional conditions

$$y(a) = \alpha_1, \qquad y(b) = \alpha_2 \tag{96}$$

although all the general conclusions we shall draw remain true for 
linear equations of any order n with additional linear conditions 
(94) of arbitrary form. The interval $(a, b)$ will be regarded as being
finite. We shall also assume that the functions $p(x)$, $q(x)$ and $f(x)$
are finite. Then we can regard any solution as extended over the
whole interval including its end-points (see property 4 in Sec. 15).
Conditions of form (96) containing only the end-point values of 
functions defined over the interval in which the solution is sought 
for are called boundary conditions. The corresponding problem of 
determining the solution of the equation is called a boundary-value 
problem. 

The solution of the above boundary-value problem is found on
the basis of the general solution of equation (95) which is

$$y(x) = Y(x) + C_1 z_1(x) + C_2 z_2(x) \tag{97}$$

where $Y(x)$ is a certain particular solution of equation (95) and
$z_1$ and $z_2$ are two linearly independent particular solutions of the
corresponding homogeneous equation (see property 1 in Sec. 15).
Substituting formula (97) into condition (96) we obtain two relations
for $C_1$ and $C_2$:

$$\left.\begin{aligned}
C_1 z_1(a) + C_2 z_2(a) &= \alpha_1 - Y(a) \\
C_1 z_1(b) + C_2 z_2(b) &= \alpha_2 - Y(b)
\end{aligned}\right\} \tag{98}$$

In solving this system of two algebraic equations of the first
degree we can encounter the following two cases (see Secs. VI.4
and VI.6):

1. Basic case. The determinant of the system is unequal to zero.
Since in this case system (98) has a certain uniquely defined solution,
equation (95) with conditions (96) possesses one and only one solution
for any right-hand member $f(x)$ and any numbers $\alpha_1$, $\alpha_2$.

2. Singular case. The determinant of the system is equal to zero.
In such a case system (98) is, as a rule, contradictory but it may
have infinitely many solutions for certain specific right-hand sides.
Hence, equation (95) with conditions (96) has, as a rule, no solution
when the function $f(x)$ and the numbers $\alpha_1$, $\alpha_2$ are chosen
arbitrarily but it has an infinitude of solutions when $f(x)$,
$\alpha_1$, $\alpha_2$ are chosen in a certain specific manner. For
example, we can easily verify that if $f(x)$ and $\alpha_1$ are given
beforehand then there are infinitely many solutions only for one
specific value of $\alpha_2$, the problem having no solutions for any
other value of $\alpha_2$.

It should be noted that it is the form of the left-hand sides of
equation (95) and of conditions (96) that determines which of the
above cases takes place.

According to Sec. VI.6, the basic case takes place if and only if
the corresponding homogeneous problem [in which we must put
$f(x) \equiv 0$, $\alpha_1 = 0$ and $\alpha_2 = 0$] has only the zero
solution. In the singular case the homogeneous problem has infinitely
many solutions. If the non-homogeneous problem possesses at least one
solution then adding the general solution of the corresponding
homogeneous problem to the particular solution of the non-homogeneous
problem we obtain the general solution of the non-homogeneous
problem.

When we deal with an initial-value problem (Cauchy's problem)
we always have the basic case because the solution always exists
and is unique. In solving a boundary-value problem we can encounter
the singular case as well. For instance, let us consider the following
problem containing a constant parameter $\lambda = \text{const}$:

$$y'' + \lambda y = f(x) \quad (0 \le x \le l), \qquad y(0) = \alpha_1, \quad y(l) = \alpha_2 \tag{99}$$

Let us first take the case $\lambda > 0$. Then the functions
$z_1(x) = \cos\sqrt{\lambda}\,x$ and $z_2(x) = \sin\sqrt{\lambda}\,x$ are
two linearly independent solutions of the corresponding homogeneous
differential equation and hence the determinant of system (98) is
equal to

$$\begin{vmatrix} z_1(0) & z_2(0) \\ z_1(l) & z_2(l) \end{vmatrix}
= \begin{vmatrix} 1 & 0 \\ \cos\sqrt{\lambda}\,l & \sin\sqrt{\lambda}\,l \end{vmatrix}
= \sin\sqrt{\lambda}\,l$$

Equating the determinant to zero we find the values

$$\lambda = \left(\frac{\pi}{l}\right)^2,\ \left(\frac{2\pi}{l}\right)^2,\ \left(\frac{3\pi}{l}\right)^2,\ \dots \tag{100}$$

for which there is a singular case for problem (99). This means that
either the existence or the uniqueness of the solution is violated.
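The computation of the determinant can be mirrored in a few lines of
code. The following is a minimal sketch, assuming $\lambda > 0$; the
function names `bvp_determinant` and `spectrum` are hypothetical and
serve only this illustration.

```python
import math

def bvp_determinant(lam, l):
    """Determinant of system (98) for problem (99) with lam > 0,
    built from z1 = cos(sqrt(lam)*x) and z2 = sin(sqrt(lam)*x)."""
    s = math.sqrt(lam)
    z1_0, z2_0 = 1.0, 0.0                      # z1(0), z2(0)
    z1_l, z2_l = math.cos(s * l), math.sin(s * l)
    return z1_0 * z2_l - z2_0 * z1_l           # equals sin(sqrt(lam)*l)

def spectrum(l, count):
    """First `count` values (100): lam_n = (n*pi/l)**2, n = 1, 2, ..."""
    return [(n * math.pi / l) ** 2 for n in range(1, count + 1)]
```

For every value returned by `spectrum(l, count)` the determinant vanishes
up to rounding (the singular case), while for any other positive
$\lambda$ it is different from zero (the basic case).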

The set of values of a parameter entering into the statement of
a problem for which the problem degenerates in a certain sense
(see Sec. II.8) is called the spectrum of the problem. We suggest
that the reader should verify that in the case $\lambda \le 0$ we
always have the basic case for problem (99). This means that the set
of the values (100) is the spectrum of the problem.

The result obtained above has an important application to the
problem of investigating the stability of an elastic bar when it is
subjected to buckling by a central force. Let a homogeneous (i.e.
having the same properties in all its parts) elastic weightless bar
be placed along the $x$-axis. Suppose that the bar is compressed by
a force $P$ and that its ends can rotate about the points of support
which are permanently kept on the $x$-axis (see Fig. 295a). When the
force increases and attains a certain critical value $P_{cr}$ the bar
buckles and takes the form depicted in Fig. 295b. Let us denote the
transverse deviation of a point $x$ of the bar from its original
position by $y$. As it is proved in courses on strength of materials,
the function $y(x)$ satisfies, within a sufficient accuracy, the
differential equation and the boundary conditions

$$y'' + \frac{P}{EJ}\,y = 0, \qquad y(0) = y(l) = 0 \tag{101}$$

where E is the so-called modulus of elasticity (also called Young’s 
modulus after T. Young, 1773-1829, an English physicist, physician 
and astronomer, one of the creators of the wave theory of light) 
and J is the moment of inertia of the cross section of the bar. As 
follows from (100), there is a basic case for problem (101) when

$$\frac{P}{EJ} < \left(\frac{\pi}{l}\right)^2 \tag{102}$$

[Fig. 295: (a) the compressed bar, (b) the buckled bar]

This means that if condition (102) is fulfilled the problem has only
the zero solution, that is there is no buckling in this case. When
inequality (102) turns into the equality, as $P$ increases, there appears
a singular case. Then, besides the zero solution, problem (101)
possesses solutions of the form $y = C\sin\frac{\pi}{l}x$ where $C$ is an
arbitrary constant. In this case there are no forces that can keep the
bar in the rectilinear equilibrium state and therefore even arbitrarily
small external actions can result in finite deviations from this
state. This means that the bar becomes unstable. The expression

$$P_{cr} = \frac{\pi^2 EJ}{l^2}$$

of the critical force was found by Euler in 1757. One may think that
after a further increase of the force, when we have $P > P_{cr}$, the
bar will again become rectilinear but this conclusion is incorrect.
The fact is that equation (101) is applicable only to the limiting
case of small deviations from the rectilinear state. A more
comprehensive investigation of the exact non-linear equation describing
the state of the bar for any deviations shows that after $P$ exceeds
$P_{cr}$ there appears a new curvilinear equilibrium state which is
stable and which exists together with the unstable rectilinear
equilibrium state. The curvature of the curvilinear equilibrium form
rapidly increases, as $P$ increases, and finally the bar breaks.
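The critical force follows from the smallest value in (100) applied to
problem (101), that is from $P/(EJ) = (\pi/l)^2$. A minimal sketch of
the computation is given below; the function names and the sample
numbers in the usage note are hypothetical and are not taken from the
text.

```python
import math

def euler_critical_force(E, J, l):
    """Critical force for a bar with hinged ends: P_cr = pi^2 * E * J / l^2.
    E is Young's modulus, J the moment of inertia of the cross section,
    l the length of the bar (consistent units are assumed)."""
    return math.pi ** 2 * E * J / l ** 2

def is_basic_case(P, E, J, l):
    """True when inequality (102) holds, i.e. the bar does not buckle."""
    return P < euler_critical_force(E, J, l)
```

For instance, with the assumed values `E = 2.0e11`, `J = 1.0e-8` and
`l = 2.0` (in SI units) the function returns $500\pi^2$, roughly
4935 newtons.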

The notion of an influence function (Green's function) introduced
in Sec. XIV.26 can be applied for solving the non-homogeneous
equation with the homogeneous boundary conditions

$$y'' + p(x)\,y' + q(x)\,y = f(x) \quad (a \le x \le b), \qquad y(a) = 0, \quad y(b) = 0 \tag{103}$$

in the basic (non-singular) case. Indeed, we can interpret the func- 
tion / ( x ) as an “external” action. Then y (x) is interpreted as a result 

•of the action, i.e. y {x) = / (x). The superposition principle is also 
applicable here (why is it so?). 

According to Sec. XIV.26, let us denote the solution of problem
(103) in which the delta function $\delta(x - \xi)$ substitutes for $f(x)$
by $G(x, \xi)$; then the solution of problem (103) for an arbitrary
function $f(x)$ is expressed in the form

$$y(x) = \int_a^b f(\xi)\,G(x, \xi)\,d\xi \tag{104}$$

We now take a simple example. Consider the problem

$$y'' = f(x) \quad (0 \le x \le l), \qquad y(0) = y(l) = 0 \tag{105}$$

If we substitute $\delta(x - \xi)$ for $f(x)$ we simply obtain $y'' = 0$
for $0 \le x < \xi$ and for $\xi < x \le l$. Thus the solution is

$$y = ax + b \ \text{ for } 0 \le x < \xi \quad\text{and}\quad y = cx + d \ \text{ for } \xi < x \le l$$

where $a$, $b$, $c$ and $d$ are some constants. The boundary conditions
imply that $b = 0$ and $cl + d = 0$, i.e.

$$y = ax \ \text{ for } 0 \le x < \xi \quad\text{and}\quad y = c\,(x - l) \ \text{ for } \xi < x \le l \tag{106}$$

If we integrate the equality $y'' = \delta(x - \xi)$ from $x = \xi - 0$ to
$x = \xi + 0$ we get $y'(\xi + 0) - y'(\xi - 0) = 1$. By the way, the
result would be the same for the left-hand side of equation (103)
because integrating a finite function over an interval of zero length
results in zero. The repeated integration of the delta function yields
a continuous function (see Sec. XIV.25) and therefore we have
$y(\xi - 0) = y(\xi + 0)$. Hence we obtain $c - a = 1$ and
$a\xi = c\,(\xi - l)$ from (106). It follows that

$$a = -\frac{l - \xi}{l}, \qquad c = \frac{\xi}{l}$$

Substituting the values of $a$ and $c$ into (106) we find the influence
function of problem (105):

$$G(x, \xi) = \begin{cases}
-\dfrac{(l - \xi)\,x}{l} & \text{for } 0 \le x \le \xi \\[2ex]
-\dfrac{\xi\,(l - x)}{l} & \text{for } \xi \le x \le l
\end{cases}$$

The function is shown in Fig. 296. On the basis of formula (104)
we obtain the formula of the solution of problem (105) valid for
any function $f(x)$:

$$y = \int_0^l G(x, \xi)\,f(\xi)\,d\xi
= \int_0^x G(x, \xi)\,f(\xi)\,d\xi + \int_x^l G(x, \xi)\,f(\xi)\,d\xi
= -\frac{l - x}{l}\int_0^x \xi\,f(\xi)\,d\xi - \frac{x}{l}\int_x^l (l - \xi)\,f(\xi)\,d\xi$$
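The influence function of problem (105) is easy to test numerically.
The sketch below is an illustration only: the names `green` and
`solve_bvp` and the midpoint quadrature are our assumptions. For
$f = 1$ the exact solution of (105) is $y = x(x - l)/2$, and for
$f(x) = x$ it is $y = x(x^2 - l^2)/6$, which can be compared with the
integral (104).

```python
def green(x, xi, l):
    """Influence function of y'' = f, y(0) = y(l) = 0."""
    if x <= xi:
        return -(l - xi) * x / l
    return -xi * (l - x) / l

def solve_bvp(f, x, l, steps=2000):
    """Approximate y(x) = int_0^l G(x, xi) f(xi) d(xi) by the midpoint rule.
    The interval is split at xi = x so that the kink of G lies at a node."""
    def integrate(a, b):
        if b <= a:
            return 0.0
        h = (b - a) / steps
        total = 0.0
        for k in range(steps):
            xi = a + (k + 0.5) * h
            total += green(x, xi, l) * f(xi)
        return total * h
    return integrate(0.0, x) + integrate(x, l)
```

The value `solve_bvp(lambda s: 1.0, 0.7, 2.0)` should be close to
$0.7\,(0.7 - 2)/2 = -0.455$.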
§ 5. Linear Equations with Constant Coefficients

Linear equations with constant coefficients form an important 
class of differential equations whose integration can be completed 
in a comparatively simple way. 

17. Homogeneous Equations. For definiteness, we now consider
the third-order equation

$$z''' + a_1 z'' + a_2 z' + a_3 z = 0 \tag{107}$$

where $a_1$, $a_2$ and $a_3$ are constants. Euler's idea is to seek
a particular solution of the equation in the form

$$z = e^{px} \tag{108}$$

where $p$ is a constant that must be appropriately chosen.

Substituting (108) into (107) we see that

$$e^{px}\,(p^3 + a_1 p^2 + a_2 p + a_3) = 0$$

The first factor being unequal to zero, we obtain

$$p^3 + a_1 p^2 + a_2 p + a_3 = 0 \tag{109}$$

Thus, a function of form (108) satisfies equation (107) if and only 
if p satisfies equation (109). Algebraic equation (109) in the un- 
known quantity p is called the characteristic equation of equation 
(107). The left-hand side of the characteristic equation (109) is 
called the characteristic polynomial of equation (107). The degree 
of a characteristic equation is equal to the order of the corresponding 
equation. 

Equation (109) has three roots (see Sec. VIII.8) which we denote
as $p_1$, $p_2$ and $p_3$. There can be different cases here, namely,
the following ones.

1. Let all the roots be real and simple (that is distinct from each
other). Then, by formula (108), we have three particular solutions
of equation (107) of the form

$$z_1 = e^{p_1 x}, \quad z_2 = e^{p_2 x} \quad\text{and}\quad z_3 = e^{p_3 x}$$

The solutions being linearly independent (i.e. none of them being
a linear combination of the rest), the general solution of equation
(107), as it is implied by Sec. 14, has the form

$$z = C_1 e^{p_1 x} + C_2 e^{p_2 x} + C_3 e^{p_3 x} \tag{110}$$
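A short worked example may help here. The particular equation below is
chosen by us for illustration and does not appear in the text; its
characteristic polynomial factors into three distinct real roots.

```latex
z''' - 6z'' + 11z' - 6z = 0
\;\Longrightarrow\;
p^3 - 6p^2 + 11p - 6 = (p - 1)(p - 2)(p - 3) = 0
\;\Longrightarrow\;
p_1 = 1,\ p_2 = 2,\ p_3 = 3
% general solution by formula (110):
z = C_1 e^{x} + C_2 e^{2x} + C_3 e^{3x}
```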

2. Let all the roots be simple but let there be imaginary roots
among them. Then there appear complex functions of a real argument
on the right-hand side of formula (110) (see Sec. VIII.6). But
it is easy to show that the whole theory of linear equations (see § 4)
can be automatically extended to the case when all the coefficients
are complex functions or numbers and all the solutions are complex
functions of a real argument. Formula (110) can therefore be applied
to the case of imaginary simple roots of equation (109). It is apparent
that the arbitrary constants are also complex here in the general case.

But when we consider real functions it is also preferable to obtain
the answer in the real form. To attain this we can take advantage
of the following assertion: if a homogeneous linear equation with
real coefficients has a complex particular solution the real part
and the imaginary part of the solution are solutions of the same
equation. Actually, if $L[y_1 + iy_2] = 0$ (see Sec. 14 on the notation)
then $L[y_1] + iL[y_2] = 0$ from which we deduce $L[y_1] = 0$ and
$L[y_2] = 0$ (why?).

Hence, if the coefficients of equation (107) are real and if it has
a particular solution

$$e^{(r + is)x} = e^{rx}\cos sx + i\,e^{rx}\sin sx$$

[see formula (VIII.12)] then the functions

$$e^{rx}\cos sx \quad\text{and}\quad e^{rx}\sin sx \tag{111}$$

are also solutions of equation (107). Since complex roots of an
algebraic equation with real coefficients form conjugate pairs of
complex numbers (see Sec. VIII.8), in our case 2 there are two roots
of the form

$$p_1 = r + is \quad\text{and}\quad p_2 = r - is$$

whereas the third root $p_3$ is real. Hence, the general solution can
be written in the form

$$z = C_1 e^{rx}\cos sx + C_2 e^{rx}\sin sx + C_3 e^{p_3 x} \tag{112}$$

instead of (110).

For instance, taking equation (88) which describes free oscillations
we obtain the characteristic equation $p^2 + \omega_0^2 = 0$ with the
roots $p_{1,2} = \pm i\omega_0 = 0 \pm i\omega_0$. Thus, the general
solution of the equation is analogous to formula (112):

$$z = C_1 e^{0t}\cos\omega_0 t + C_2 e^{0t}\sin\omega_0 t$$

Thus, we have obtained the solution of the equation in form (89).

Formula (112) is sometimes rewritten in another form by taking
advantage of the formula

$$C_1\cos sx + C_2\sin sx = M\sin(sx + \alpha)$$

To obtain this expression we must put

$$C_1 = M\sin\alpha, \quad C_2 = M\cos\alpha, \quad M = \sqrt{C_1^2 + C_2^2} \quad\text{and}\quad \tan\alpha = \frac{C_1}{C_2}$$

(compare with Sec. I.29). Then, instead of (112), we get

$$z = M e^{rx}\sin(sx + \alpha) + C_3 e^{p_3 x} \tag{113}$$

where the role of arbitrary constants is played by $M$, $\alpha$ and $C_3$.
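The passage from $C_1$, $C_2$ to $M$, $\alpha$ can be sketched in code as
follows. The function name is hypothetical; `math.atan2` is used instead
of a plain arctangent so that the angle lands in the correct quadrant and
the case $C_2 = 0$ causes no division by zero.

```python
import math

def amplitude_phase(c1, c2):
    """Return (M, alpha) with c1 = M*sin(alpha), c2 = M*cos(alpha), so that
    c1*cos(s*x) + c2*sin(s*x) = M*sin(s*x + alpha)."""
    m = math.hypot(c1, c2)        # M = sqrt(c1^2 + c2^2)
    alpha = math.atan2(c1, c2)    # tan(alpha) = c1 / c2
    return m, alpha
```

For `c1 = 3.0`, `c2 = 4.0` the amplitude is 5 and both sides of the
identity agree for every $x$.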
3. Let there be multiple roots among the roots of characteristic
equation (109). For example, let $p_2 = p_1$ and $p_3 \ne p_1$. Then, of
course, formula (110) does not yield the general solution since
formula (108) gives only two distinct solutions in this case (i.e.
$e^{p_1 x}$ and $e^{p_3 x}$).

To find a third solution we begin with the case when
$p_2 = p_1 + \Delta p$ where $|\Delta p| \ne 0$ is small. Then equation
(107) has the solution

$$e^{p_2 x} = e^{p_1 x} e^{\Delta p \cdot x} = e^{p_1 x}\left(1 + \Delta p \cdot x + \frac{(\Delta p)^2}{2!}x^2 + \dots\right)$$

together with the solution $e^{p_1 x}$ [here we have used expansion
(IV.55)]. Therefore the linear combinations

$$e^{p_2 x} - e^{p_1 x} = e^{p_1 x}\left(\Delta p \cdot x + \frac{(\Delta p)^2}{2!}x^2 + \dots\right)$$

and

$$\frac{e^{p_2 x} - e^{p_1 x}}{\Delta p} = e^{p_1 x}\left(x + \frac{\Delta p}{2!}x^2 + \dots\right) \tag{114}$$

are also solutions of the equation.

After dividing by $\Delta p$ we can pass to the limit, as
$\Delta p \to 0$. Then all the terms containing $\Delta p$ tend to zero,
and therefore we obtain the solution $xe^{p_1 x}$ for $\Delta p = 0$,
that is for $p_2 = p_1$. Hence, in this case equation (107) has the
general solution of the form

$$z = C_1 e^{p_1 x} + C_2 x e^{p_1 x} + C_3 e^{p_3 x}$$

Similarly, in the case $p_1 = p_2 = p_3$ equation (107) has the solution
$e^{p_1 x}$ and, besides, the solutions $xe^{p_1 x}$ and $x^2 e^{p_1 x}$.
To prove this we can take the second divided difference instead of (114)
and pass to the limit (see Sec. V.7). Therefore the general solution for
this case has the form

$$z = C_1 e^{p_1 x} + C_2 x e^{p_1 x} + C_3 x^2 e^{p_1 x}$$

Equations of an arbitrary order are investigated in a similar way.
If a root $p$ of a characteristic equation has a multiplicity $k$ then
the functions

$$e^{px},\ xe^{px},\ \dots,\ x^{k-1}e^{px}$$

are particular solutions of the corresponding differential equation.
If a pair of complex conjugate roots $r \pm is$ is of multiplicity $k$
then the functions

$$e^{rx}\cos sx,\ e^{rx}\sin sx,\ xe^{rx}\cos sx,\ xe^{rx}\sin sx,\ \dots,\ x^{k-1}e^{rx}\cos sx,\ x^{k-1}e^{rx}\sin sx$$

are particular solutions.
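The rule for multiple roots can be checked on a simple example chosen by
us for illustration (it is not taken from the text): an equation whose
characteristic polynomial has the triple root $p = 1$.

```latex
z''' - 3z'' + 3z' - z = 0
\;\Longrightarrow\;
p^3 - 3p^2 + 3p - 1 = (p - 1)^3 = 0
\;\Longrightarrow\;
z = (C_1 + C_2 x + C_3 x^2)\,e^{x}
% direct check for z = x e^{x}:
% z' = (1 + x)e^{x},\quad z'' = (2 + x)e^{x},\quad z''' = (3 + x)e^{x}
% z''' - 3z'' + 3z' - z = \bigl[(3 + x) - 3(2 + x) + 3(1 + x) - x\bigr]e^{x} = 0
```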

Thus, the only practical difficulty in integrating a homogeneous
linear equation with constant coefficients lies in solving the
corresponding characteristic equation which can be performed by means
of the methods described in Secs. V.2-5 and VIII.9.

As an example, we now consider free oscillations of a material
point when there is a linear law of elasticity and an additional force
of viscous friction which is proportional to the first degree of the
velocity. In this case we must add the term $-f\,\frac{dy}{dt}$ to the
right-hand side of equation (3) where $f$ is the coefficient of viscous
friction. After transposing all the terms to the left-hand side and
dividing by $M$ we obtain the equation

$$z'' + 2hz' + \omega_0^2 z = 0 \tag{115}$$

similar to (88) where $2h = \frac{f}{M}$.

In solving the characteristic equation

$$p^2 + 2hp + \omega_0^2 = 0 \tag{116}$$

we can encounter two basic cases. If $h < \omega_0$, that is if the
friction is comparatively small, equation (116) has the solution

$$p_{1,2} = -h \pm \sqrt{h^2 - \omega_0^2} = -h \pm i\sqrt{\omega_0^2 - h^2}$$

and the general solution of equation (115) is therefore analogous
to (113):

$$z(t) = Me^{-ht}\sin(\omega t + \alpha)$$

where $\omega = \sqrt{\omega_0^2 - h^2}$.

We see that the presence of small friction yields damped oscillations
whose amplitude decreases according to an exponential law
(because of the factor $e^{-ht}$). Besides, the friction decreases the
frequency (since $\omega < \omega_0$). The graph of the solution is
depicted in Fig. 297. Zeros of the solution are defined by the factor
$\sin(\omega t + \alpha)$ and therefore they are equally spaced. After
the time period $T = \frac{2\pi}{\omega}$ elapses the sine repeats its
value and the function $e^{-ht}$ acquires the factor
$e^{-hT} = e^{-2\pi h/\omega}$ which causes damping. The time interval $T$
is often called the “period” of the oscillations although the function
$z(t)$ is a non-periodic function here because during each subsequent
“period” the amplitude of $z$ decreases $e^{hT}$ times (whereas the
general form of the function does not change).

If $h > \omega_0$, that is if the friction is comparatively large,
equation (116) has real roots, and equation (115) has the general
solution

$$z(t) = C_1 e^{p_1 t} + C_2 e^{p_2 t} = C_1 e^{-\left(h - \sqrt{h^2 - \omega_0^2}\right)t} + C_2 e^{-\left(h + \sqrt{h^2 - \omega_0^2}\right)t}$$

Only the first summand is essential here for large $t$ (why?). Hence,
we obtain an exponential damping without oscillations in this case.
This is the so-called aperiodic damping.

Theoretically, there is a possibility of a “bordering” case when
$h = \omega_0$. Then equation (116) has a double root. We leave it to the
reader to verify that in this case we also have aperiodic damping.

18. Non-Homogeneous Equations with Right-Hand Sides of Special
Form. We now turn to non-homogeneous linear equations with
constant coefficients. For instance, let us take a third-order equation
of the form

$$y''' + a_1 y'' + a_2 y' + a_3 y = f(x) \qquad (a_1, a_2, a_3 = \text{const}) \tag{117}$$

The corresponding homogeneous equation being always solvable
(see Sec. 17), the considerations of Sec. 15 imply that it is only a
single particular solution of equation (117) that we have to find. For
the right-hand side of the general form this can be achieved by the
method of variation of arbitrary constants (see Sec. 15). But there
exists an important and a rather wide class of right-hand sides of
a special form for which a particular solution can be found considerably
faster and simpler by applying the method of undetermined
coefficients.

We first take the equation

$$y''' + a_1 y'' + a_2 y' + a_3 y = Ke^{\lambda x} \qquad (K, \lambda = \text{const}) \tag{118}$$

It is natural to seek a particular solution of the equation in the form

$$y = Ae^{\lambda x} \tag{119}$$

where the constant $A$ is yet unknown. Substituting (119) into (118)
we obtain

$$A\lambda^3 e^{\lambda x} + Aa_1\lambda^2 e^{\lambda x} + Aa_2\lambda e^{\lambda x} + Aa_3 e^{\lambda x} = Ke^{\lambda x}$$

This results in

$$A = \frac{K}{P(\lambda)} \quad\text{and}\quad y = \frac{K}{P(\lambda)}\,e^{\lambda x} \tag{120}$$

(after cancelling out the factor $e^{\lambda x}$) where $P(\lambda)$
designates the characteristic polynomial.

The result is valid if $P(\lambda) \ne 0$, that is if $\lambda$ is not
a root of the characteristic equation. If $P(\lambda) = 0$ then function
(119) satisfies homogeneous equation (107) and therefore it cannot
satisfy equation (118).

Let $P(\lambda) = 0$ and $P'(\lambda) \ne 0$ which means that $\lambda$
is a simple root (see Sec. VIII.8) of the characteristic equation. Then,
following the same way of reasoning as in Sec. 17 when we considered
the case of multiple roots, we substitute $\lambda_1 = \lambda + \alpha$
for $\lambda$ into the right-hand side of equation (118) where
$|\alpha|$ is small but different from zero. The value $\lambda_1$ is
no longer a root of the characteristic equation and therefore formula
(120) suggests that equation (118) has a particular solution of the form

$$\frac{K}{P(\lambda_1)}\,e^{\lambda_1 x} = \frac{K}{P(\lambda + \alpha)}\,e^{(\lambda + \alpha)x}
= \frac{K}{P(\lambda + \alpha)}\,e^{\lambda x}\left(1 + \alpha x + \frac{\alpha^2 x^2}{2!} + \dots\right)$$

$$= \frac{K}{P(\lambda + \alpha)}\,e^{\lambda x}
+ \frac{K\alpha}{P(\lambda + \alpha)}\,e^{\lambda x}\left(x + \frac{\alpha x^2}{2!} + \dots\right)$$

The first summand on the right-hand side of the last relation
satisfying the corresponding homogeneous equation, the second
summand is also a particular solution of equation (118) in which
$\lambda = \lambda_1$ (why is it so?). We now pass to the limit in the
second summand as $\alpha \to 0$. Applying L'Hospital's rule to
calculating the limit $\lim_{\alpha \to 0} \frac{\alpha}{P(\lambda + \alpha)}$
we obtain the expression

$$y = \frac{K}{P'(\lambda)}\,xe^{\lambda x} \tag{121}$$

(check it!) which is a particular solution of equation (118) for the
original value of $\lambda$.
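Formulas (120) and (121) translate directly into a small routine. The
following is a minimal sketch for the third-order equation (118); the
function names are hypothetical, the tolerance used to decide whether
$\lambda$ is a root is an assumption, and roots of higher multiplicity
are deliberately not handled.

```python
def char_poly(coeffs, lam):
    """P(lam) = lam^3 + a1*lam^2 + a2*lam + a3 for coeffs = (a1, a2, a3)."""
    a1, a2, a3 = coeffs
    return lam ** 3 + a1 * lam ** 2 + a2 * lam + a3

def char_poly_derivative(coeffs, lam):
    """P'(lam) = 3*lam^2 + 2*a1*lam + a2."""
    a1, a2, _ = coeffs
    return 3 * lam ** 2 + 2 * a1 * lam + a2

def particular_solution(coeffs, K, lam, tol=1e-12):
    """Return (A, k) such that y = A * x**k * exp(lam*x) solves (118):
    k = 0, A = K/P(lam)  if lam is not a root, formula (120);
    k = 1, A = K/P'(lam) if lam is a simple root, formula (121)."""
    p = char_poly(coeffs, lam)
    if abs(p) > tol:
        return K / p, 0
    dp = char_poly_derivative(coeffs, lam)
    if abs(dp) > tol:
        return K / dp, 1
    raise ValueError("lam is a multiple root; this sketch does not cover it")
```

For the equation $y''' - 6y'' + 11y' - 6y = 4e^{x}$ (an example of our
own) the call `particular_solution((-6.0, 11.0, -6.0), 4.0, 1.0)` returns
`(2.0, 1)`, that is $y = 2xe^{x}$.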

In a similar way we investigate the case when $\lambda$ is a double root
of the characteristic equation and find that in this case there is
a particular solution of equation (118) of the form

$$y = \frac{K}{P''(\lambda)}\, x^2 e^{\lambda x}$$

and so on.
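The three formulas above can be checked numerically. The following is only a sketch under stated assumptions: the helper names `char_poly`, `deriv_xk_exp` and `residual` are hypothetical, the third-order form of equation (118) is taken from the substitution shown earlier, and the derivatives of $x^k e^{\lambda x}$ are computed by Leibniz' rule rather than by any method given in the text.

```python
import math

def char_poly(lam, a1, a2, a3, order=0):
    """P(lam) = lam^3 + a1 lam^2 + a2 lam + a3, or its derivative of order 1 or 2."""
    if order == 0:
        return lam**3 + a1 * lam**2 + a2 * lam + a3
    if order == 1:
        return 3 * lam**2 + 2 * a1 * lam + a2
    if order == 2:
        return 6 * lam + 2 * a1
    raise ValueError("only orders 0, 1, 2 are handled in this sketch")

def deriv_xk_exp(n, k, lam, x):
    """n-th derivative of x^k e^{lam x}, obtained from Leibniz' rule."""
    total = 0.0
    for j in range(min(n, k) + 1):
        total += math.comb(n, j) * math.perm(k, j) * x**(k - j) * lam**(n - j)
    return total * math.exp(lam * x)

def residual(k, K, lam, a1, a2, a3, x):
    """Left side minus right side of (118) for y = K x^k e^{lam x} / P^(k)(lam)."""
    c = K / char_poly(lam, a1, a2, a3, order=k)
    coeff = (a3, a2, a1, 1.0)  # coefficients of y, y', y'', y'''
    lhs = sum(coeff[n] * c * deriv_xk_exp(n, k, lam, x) for n in range(4))
    return lhs - K * math.exp(lam * x)

# P(s) = s^3 - 3s + 2 = (s - 1)^2 (s + 2): 1 is a double root, -2 a simple root
print(residual(0, 2.0, 0.5, 0.0, -3.0, 2.0, 0.7))   # lam is not a root, formula (120)
print(residual(1, 2.0, -2.0, 0.0, -3.0, 2.0, 0.7))  # simple root, formula (121)
print(residual(2, 2.0, 1.0, 0.0, -3.0, 2.0, 0.7))   # double root
```

All three printed residuals should vanish up to rounding error; the sample polynomial was chosen only because its roots are known in advance.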

By means of a similar but a little more complicated procedure we
can prove that the equation

$$y''' + a_1 y'' + a_2 y' + a_3 y = Q_m(x)\, e^{\lambda x} \qquad (122)$$

where $Q_m(x)$ is a given polynomial of degree $m$ possesses a
particular solution of the form

$$y = R_m(x)\, e^{\lambda x} \qquad (123)$$

provided $\lambda$ is not a root of the characteristic equation [$R_m(x)$ denotes
some other polynomial of degree $m$]. The solution can also be found
by means of the method of undetermined coefficients. To apply the
method we write expression (123) with literal coefficients and
substitute it into (122). Then the usual procedure of equating
coefficients in similar terms, based on the identical equality of both
sides of the relation, yields the numerical values of the coefficients
entering into (123). Similarly, if $\lambda$ is a root of the characteristic
equation of multiplicity $k$ then there is a particular solution of the
form

$$y = x^k R_m(x)\, e^{\lambda x}$$

The case $\lambda = 0$ is not excluded here and therefore we can also
obtain a particular solution when there is a polynomial of degree
$m$ on the right-hand side of equation (122) (without an exponential
factor).
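The method of undetermined coefficients for (122) can be sketched in a few lines. The sketch below is an illustration, not the author's procedure: it relies on the identity $P(D)(R e^{\lambda x}) = e^{\lambda x} \left[ P(\lambda) R + P'(\lambda) R' + \frac{P''(\lambda)}{2!} R'' + \frac{P'''(\lambda)}{3!} R''' \right]$ (a known fact that the text does not state), which turns the equations for the coefficients of $R_m$ into a triangular system solved from the highest power down. The function names are hypothetical.

```python
from math import factorial

def shifted_coeffs(lam, a1, a2, a3):
    """c_j = P^(j)(lam) / j! for P(s) = s^3 + a1 s^2 + a2 s + a3, j = 0..3."""
    return [lam**3 + a1 * lam**2 + a2 * lam + a3,
            3 * lam**2 + 2 * a1 * lam + a2,
            3 * lam + a1,
            1.0]

def undetermined_coefficients(q, lam, a1, a2, a3):
    """Coefficients r[0..m] of R_m such that y = R_m(x) e^{lam x} satisfies (122).

    q[k] is the coefficient of x^k in Q_m; lam must not be a characteristic root.
    """
    c = shifted_coeffs(lam, a1, a2, a3)
    if c[0] == 0:
        raise ValueError("lam is a root of the characteristic equation")
    m = len(q) - 1
    r = [0.0] * (m + 1)
    for k in range(m, -1, -1):
        s = q[k]
        for j in range(1, 4):
            if k + j <= m:
                # coefficient of x^k in the j-th derivative of R_m
                s -= c[j] * (factorial(k + j) // factorial(k)) * r[k + j]
        r[k] = s / c[0]
    return r

# y''' + y = x e^x gives R_1(x) = x/2 - 3/4
print(undetermined_coefficients([0.0, 1.0], 1.0, 0.0, 0.0, 1.0))
# lam = 0: y''' + y = x gives the polynomial solution y = x
print(undetermined_coefficients([0.0, 1.0], 0.0, 0.0, 0.0, 1.0))
```

The second call illustrates the remark about $\lambda = 0$: the same routine handles a purely polynomial right-hand side.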

We can also consider the equation

$$y''' + a_1 y'' + a_2 y' + a_3 y = Q_m(x)\, e^{\mu x} \cos \nu x$$

Since we can rewrite the right-hand side in the form

$$Q_m(x)\, e^{\mu x}\, \frac{e^{i \nu x} + e^{-i \nu x}}{2} = \frac{Q_m(x)}{2}\, e^{(\mu + i\nu) x} + \frac{Q_m(x)}{2}\, e^{(\mu - i\nu) x}$$

(see Sec. VIII.4), formula (123) suggests that if $\lambda = \mu \pm i\nu$ is not
a root of the characteristic equation, a particular solution can be
sought in the form

$$y = R_m(x)\, e^{(\mu + i\nu) x} + \tilde{R}_m(x)\, e^{(\mu - i\nu) x} =$$

$$= R_m(x)\, e^{\mu x} (\cos \nu x + i \sin \nu x) + \tilde{R}_m(x)\, e^{\mu x} (\cos \nu x - i \sin \nu x) =$$

$$= [R_m(x) + \tilde{R}_m(x)]\, e^{\mu x} \cos \nu x + [i R_m(x) - i \tilde{R}_m(x)]\, e^{\mu x} \sin \nu x$$

Changing the notation we arrive at a particular solution of the form

$$y = T_m(x)\, e^{\mu x} \cos \nu x + S_m(x)\, e^{\mu x} \sin \nu x \qquad (124)$$

where $T_m(x)$ and $S_m(x)$ are polynomials of degree $m$ which can
be found with the help of the method of undetermined coefficients.
In the case the right-hand side is of the form

$$Q_m(x)\, e^{\mu x} \sin \nu x \quad \text{or} \quad Q_m(x)\, e^{\mu x} \cos (\nu x + \alpha) \quad \text{or} \quad Q_m(x)\, e^{\mu x} \sin (\nu x + \alpha) \quad (\alpha = \text{const})$$

we can also seek a particular solution in form (124).

If $\lambda = \mu \pm i\nu$ is a root of the characteristic equation of
multiplicity $k$ the right-hand side of formula (124) should be additionally
multiplied by $x^k$.

For instance, let us consider equation (87) with a sinusoidal
right-hand member $K \sin \omega t$:

$$y'' + \omega_0^2 y = K \sin \omega t \qquad (125)$$

The equation describes forced oscillations under the action of an
external sinusoidal force of frequency $\omega$.

According to formula (124), if $\pm i\omega$ does not coincide with
any root of the characteristic equation, i.e. if $\omega \neq \omega_0$, a particular
solution can be found in the form

$$y = A \cos \omega t + B \sin \omega t$$

Substituting this expression into equation (125) we obtain

$$-A \omega^2 \cos \omega t - B \omega^2 \sin \omega t + A \omega_0^2 \cos \omega t + B \omega_0^2 \sin \omega t = K \sin \omega t$$

The last equality being an identity, we must have

$$-A \omega^2 + A \omega_0^2 = 0 \quad \text{and} \quad -B \omega^2 + B \omega_0^2 = K$$

Hence, $A = 0$ and $B = \frac{K}{\omega_0^2 - \omega^2}$, and thus we obtain

$$y = \frac{K}{\omega_0^2 - \omega^2} \sin \omega t \qquad (126)$$

To obtain the general solution of equation (125) we must add 
the general solution of equation (88) to expression (126). Consequent- 
ly, if the frequency of the external force is different from the funda- 
mental frequency of oscillations we have a superposition of two 
harmonic oscillations of forms (89) and (126). Function (126) 
describes the so-called forced oscillations whose frequency coincides 
with the frequency of the external action and whose amplitude and 
phase have completely specified values. Function (89), which is 
a solution of equation (88), describes free oscillations whose am- 
plitude and phase depend on the initial data. 

Formula (126) shows that when $\omega \neq \omega_0$ and is close to $\omega_0$ the
amplitude of forced oscillations becomes very large. But if $\omega = \omega_0$
then, according to the general theory, a particular solution of
equation (125) can be found in the form $y = At \cos \omega_0 t + Bt \sin \omega_0 t$.

The substitution of the last expression into the equation results
in the formula

$$y = -\frac{K}{2 \omega_0}\, t \cos \omega_0 t$$

which can also be deduced from (126) by analogy with formula (121).
(Let the reader verify the validity of the above expression.)

We see that if the frequency of a harmonic external action is 
equal to the fundamental frequency of oscillations the amplitude 
of forced oscillations increases according to a linear law. This impor- 
tant phenomenon is well known in physics and engineering and is 
called resonance. 
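The behaviour near and at resonance can be illustrated numerically. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the second derivative is estimated by a central difference (not a method from the text) in order to check that $y = -\frac{K}{2\omega_0} t \cos \omega_0 t$ satisfies (125) with $\omega = \omega_0$.

```python
import math

def forced_amplitude(K, omega0, omega):
    """Coefficient B = K / (omega0^2 - omega^2) of sin(omega t) in (126)."""
    if omega == omega0:
        raise ValueError("resonance: formula (126) does not apply")
    return K / (omega0**2 - omega**2)

def resonant_solution(K, omega0, t):
    """Particular solution y = -K t cos(omega0 t) / (2 omega0) for omega = omega0."""
    return -K * t * math.cos(omega0 * t) / (2 * omega0)

def second_derivative(f, t, h=1e-4):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

K, omega0 = 3.0, 2.0

# the amplitude grows without bound as omega approaches omega0
for omega in (1.0, 1.9, 1.99, 1.999):
    print(omega, forced_amplitude(K, omega0, omega))

# residual of (125) at resonance; it should be close to zero
y = lambda t: resonant_solution(K, omega0, t)
t = 1.3
print(second_derivative(y, t) + omega0**2 * y(t) - K * math.sin(omega0 * t))
```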

19. Euler's Equations. Euler's equations are of the form

$$(ax + b)^n y^{(n)} + a_1 (ax + b)^{n-1} y^{(n-1)} + \dots + a_{n-1} (ax + b)\, y' + a_n y = f(x)$$

where $a_1, a_2, \dots, a_n$ are constant coefficients.

Euler's equation can be easily reduced to a linear equation with
constant coefficients by means of the change of the independent
variable

$$|ax + b| = e^t, \quad \text{i.e.} \quad t = \ln |ax + b|$$

For the sake of simplicity, let us suppose that $ax + b > 0$ and
take a homogeneous second-order equation:

$$(ax + b)^2 y'' + a_1 (ax + b)\, y' + a_2 y = 0 \qquad (127)$$

After the independent variable is changed we obtain

$$ax + b = e^t, \quad t = \ln (ax + b), \qquad (128)$$

$$\frac{dy}{dx} = \frac{dy}{dt} \frac{dt}{dx} = \frac{dy}{dt}\, \frac{a}{ax + b} = a e^{-t} \frac{dy}{dt}$$

and

$$\frac{d^2 y}{dx^2} = \frac{d}{dt} \left( a e^{-t} \frac{dy}{dt} \right) \frac{dt}{dx} = a^2 e^{-2t} \left( \frac{d^2 y}{dt^2} - \frac{dy}{dt} \right)$$

Substituting the results into equation (127) we obtain

$$a^2 \left( \frac{d^2 y}{dt^2} - \frac{dy}{dt} \right) + a_1 a \frac{dy}{dt} + a_2 y = 0$$

This is an equation with constant coefficients which can be solved
by means of the methods described in Sec. 17, i.e. we must put

$$y = e^{pt} \qquad (129)$$

Then we solve the corresponding characteristic equation etc. and,
finally, turn back from $t$ to $x$.

It is possible to avoid substitution (128) because (128) and (129)
imply that

$$y = (ax + b)^p \qquad (130)$$

We can therefore directly substitute (130) into (127), i.e. we can
seek a solution in form (130). This yields a characteristic equation
for determining $p$, the degree of the equation being equal to the
order of equation (127). But one must take into account that, in
case there are multiple roots of the characteristic equation, equation
(127) possesses solutions of the form

$$y = t e^{pt} = (ax + b)^p \ln (ax + b)$$

besides solutions of form (130) (the form of these additional
solutions depends on the multiplicity of the root; see case 3 in
Sec. 17).
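As an illustration of seeking a solution in form (130), substituting $y = (ax + b)^p$ into (127) gives the characteristic equation $a^2 p(p - 1) + a_1 a p + a_2 = 0$. The sketch below solves it and checks the result; the function names and the sample coefficients are assumptions made for this example only.

```python
import cmath

def euler_exponents(a, a1, a2):
    """Roots p of a^2 p(p - 1) + a1 a p + a2 = 0."""
    A, B, C = a * a, a1 * a - a * a, a2
    disc = cmath.sqrt(B * B - 4 * A * C)
    return (-B + disc) / (2 * A), (-B - disc) / (2 * A)

def euler_residual(p, a, b, a1, a2, x):
    """Left side of (127) for y = (ax + b)^p, assuming ax + b > 0."""
    s = a * x + b
    y = s**p
    dy = p * a * s**(p - 1)
    d2y = p * (p - 1) * a * a * s**(p - 2)
    return s * s * d2y + a1 * s * dy + a2 * y

# x^2 y'' - 2x y' + 2y = 0 has the exponents p = 2 and p = 1
p1, p2 = euler_exponents(1.0, -2.0, 2.0)
print(p1, p2)
print(abs(euler_residual(p1, 1.0, 0.0, -2.0, 2.0, 1.7)))

# complex exponents are handled by the same formulas
q1, q2 = euler_exponents(2.0, 1.0, 3.0)
print(abs(euler_residual(q1, 2.0, 1.0, 1.0, 3.0, 0.4)))
```

Complex exponents give complex-valued solutions; real solutions are obtained from them as in case 2 of Sec. 17.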

20. Operators and the Operator Method of Solving Differential
Equations. The notion of an operator was introduced in Sec. XIV.26.
We also considered some examples of operators including the
operator of differentiation $D$. In Sec. 14 we introduced a differential
operator $L$. Further examples of operators are the shift operator $T$
and the difference operator $\Delta$ which are defined by the formulas

$$T f(x) = f(x + h) \quad \text{and} \quad \Delta f(x) = f(x + h) - f(x) \qquad (131)$$

where $h$ is a given step. We also introduce the operator of
multiplication by a number $C$ which we denote by the same letter $C$
(including the unit operator 1 which does not change a function and
the zero operator 0 which transforms any function into the function
which is identically equal to zero). There is also an operator of
multiplication by a given function and so on.

Operators can be added together and multiplied by numbers
according to the following rule which looks quite natural: if $A$ and
$B$ are operators and $\alpha$ is a number then

$$(A + B) f = Af + Bf \quad \text{and} \quad (\alpha A) f = \alpha (Af)$$

For instance, equality (131) suggests that

$$\Delta = T - 1 \quad \text{and} \quad T = 1 + \Delta$$

All the axioms of linear operations hold for these rules of addition
of operators and multiplication by a number (see Sec. VII.17).

We can multiply an operator by another one according to Sec. XI.6:
if $A$ and $B$ are operators then $AB$ is a new operator defined by the
formula

$$(AB) f = A (Bf)$$

which means that to obtain $ABf$ we must first apply the operator
$B$ to the function $f$ and then apply operator $A$ to the result. We can
easily verify the following rules:

$$A (BC) = (AB) C \quad \text{and} \quad (\alpha A + \beta B) C = \alpha AC + \beta BC \qquad (132)$$

where $\alpha, \beta = \text{const}$. But the equality $AB = BA$ may not hold,
that is in the general case the result of performing two operations
may depend on the order in which they are performed. Nevertheless, in
particular cases we can have $AB = BA$ and then we say that the
operators commute with each other. For example, all the above
operators $D$, $T$, $\Delta$ and $C$ commute with each other because we have

$$DT f(x) = D (T f(x)) = D (f(x + h)) = f'(x + h),$$

$$TD f(x) = T (f'(x)) = f'(x + h) \quad \text{etc.}$$

On the other hand, the operator of differentiation of a function and
the operator of multiplication of a function by a given function
do not commute (the reader should verify it!).
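The rules for multiplying operators can be tried out by modelling an operator as a function that maps a function to a function. This is only a sketch: the names `shift`, `difference`, `derivative`, `times` and `compose` are hypothetical, and differentiation is replaced by a central-difference approximation, which is an assumption of the example and not part of the text.

```python
def shift(h):
    """Shift operator T: (T f)(x) = f(x + h)."""
    return lambda f: (lambda x: f(x + h))

def difference(h):
    """Difference operator: (Delta f)(x) = f(x + h) - f(x)."""
    return lambda f: (lambda x: f(x + h) - f(x))

def derivative(eps=1e-5):
    """Approximate differentiation operator D (central difference)."""
    return lambda f: (lambda x: (f(x + eps) - f(x - eps)) / (2 * eps))

def times(g):
    """Operator of multiplication by a given function g."""
    return lambda f: (lambda x: g(x) * f(x))

def compose(A, B):
    """Product AB: apply B first, then A."""
    return lambda f: A(B(f))

f = lambda x: x**3 - 2 * x
T, Delta, D, X = shift(0.5), difference(0.5), derivative(), times(lambda x: x)

# T and Delta commute
print(compose(T, Delta)(f)(1.2), compose(Delta, T)(f)(1.2))

# D and multiplication by x do not commute: (DX - XD) f = f
print(compose(D, X)(f)(1.2) - compose(X, D)(f)(1.2), f(1.2))
```

The last line shows that $D(xf) - x\,Df = f$, so the two products differ by the unit operator.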

The first property (132) enables us to write $ABC$ instead of $(AB) C$
or $A (BC)$. Thus, $ABC$ is an operator whose action upon an object
reduces to performing the operations $C$, $B$ and $A$ in succession. An
operator of the form $ABCD$ is defined similarly and so on. If we
take equal factors then we arrive at powers of an operator: $A^2$, $A^3$,
$A^4$ etc. Hence, $A^n$ designates the repeated application of the
operator $A$. For instance, we have $D^2 f = f''$, and $\Delta^2 f$ is the second
difference (see Sec. XI.6) etc.

An operator $A$ is called a linear operator if

$$A (f_1 + f_2) = A (f_1) + A (f_2) \quad \text{and} \quad A (\alpha f) = \alpha A (f) \qquad (133)$$

where $\alpha = \text{const}$ (see Sec. XI.6). It is natural to interpret the first
property as the principle of superposition, and the second property
can be deduced from the first (see Sec. XIV.26). Even in the case
when the explicit expression of an operator is unknown the validity
of the superposition principle indicates the linearity of the operator
which enables us to draw some useful conclusions; for example, we
can speak about its influence function (see Sec. XIV.26) in such
a case.

Both properties (133) can be put down as

$$A (\alpha f_1 + \beta f_2) = \alpha A f_1 + \beta A f_2$$

where $\alpha$ and $\beta$ are constants. It is easy to verify the following
property of a linear operator $A$:

$$A (\alpha B + \beta C) = \alpha AB + \beta AC \qquad (134)$$

where $B$ and $C$ are arbitrary operators. To deduce the property one
should apply both parts of (134) to an arbitrary function $f$ and show
that the results will be the same, namely, equal to
$\alpha A (Bf) + \beta A (Cf)$.

All the operators we have taken here as examples are linear. 
Indeed, as we know, the derivative of a sum equals the sum of the 
derivatives and so on. An example of a non-linear operator is the 
operator of squaring a function or the operator of forming the ab- 
solute value of a function (verify that the operators are non-linear!) 
and the like. In performing linear operations on linear operators 
and the operation of multiplying operators by each other we can 
use rules of elementary algebra but at the same time we must pay 
attention to the order of factors. For instance,
$(A + B)^2 = (A + B)(A + B) = A^2 + AB + BA + B^2$. This assertion is
suggested by (132) and (134).

In addition, if the operators commute with each other the order 
of factors does not matter either. For example, in such a case we 
have $(A + B)^2 = A^2 + 2AB + B^2$ and so on.

We can also consider power series (see Sec. IV.16) of operators,
for instance, we can define the operator $e^A$ as

$$e^A = 1 + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \dots \qquad (135)$$

and so on.

Generally, such a series or any other operator cannot be applied
to all functions because usually there is a certain class of functions
for which an operator makes sense, and thus it should be used only
for these functions.

As for example (135), we see that the greater the number of the
terms taken, the more accurate the result. Theoretically, the exact
result is obtained only in the limiting process. From the practical
point of view this means that the number of terms must be
sufficiently large for the result to be regarded as exact.

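Taylor's series

$$f(x + h) = f(x) + \frac{f'(x)}{1!} h + \frac{f''(x)}{2!} h^2 + \dots$$

(see Sec. IV.16) can be written in the form

$$T f = \left( 1 + \frac{D}{1!} h + \frac{D^2}{2!} h^2 + \dots \right) f = e^{hD} f$$

This implies a relation between the operators $T$, $\Delta$ and $D$, namely

$$T = e^{hD} \quad \text{and} \quad \Delta = e^{hD} - 1$$

The inverse formula

$$D = \frac{1}{h} \ln (1 + \Delta) = \frac{1}{h} \left( \Delta - \frac{\Delta^2}{2} + \frac{\Delta^3}{3} - \dots \right)$$

[which corresponds to formula (IV.61) of expansion of the natural
logarithm] is nothing but formula (V.32) of numerical
differentiation.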
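The expansion of $D$ in powers of $\Delta$ can be tried out directly. The sketch below is an illustration with hypothetical function names: it builds the forward differences of a function at a point and sums the first $n$ terms of $\frac{1}{h} \left( \Delta - \frac{\Delta^2}{2} + \frac{\Delta^3}{3} - \dots \right)$.

```python
import math

def forward_differences(f, x, h, n):
    """Return the forward differences of f at x of orders 1..n for step h."""
    values = [f(x + k * h) for k in range(n + 1)]
    diffs = []
    for _ in range(n):
        values = [values[i + 1] - values[i] for i in range(len(values) - 1)]
        diffs.append(values[0])
    return diffs

def derivative_by_differences(f, x, h, n):
    """Partial sum with n terms of (1/h)(Delta - Delta^2/2 + Delta^3/3 - ...) f(x)."""
    diffs = forward_differences(f, x, h, n)
    return sum((-1) ** k * d / (k + 1) for k, d in enumerate(diffs)) / h

# derivative of e^x at x = 0 (the exact value is 1) with step h = 0.1
for n in (1, 3, 5):
    print(n, derivative_by_differences(math.exp, 0.0, 0.1, n))
```

For this smooth function and small step, taking more terms of the series brings the estimate closer to the exact value.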

We can consider an operator equation of the form

$$A y = f \qquad (136)$$

where $f$ is a given function and the function $y$ is sought for. If there
is a solution of equation (136) it is natural to denote it as $y = A^{-1} f$.
In case the operator is linear the equation is also said to be a linear 
equation. We can immediately extend properties 1-3 in Sec. 14 
and property 1 in Sec. 15 to general linear equations. But it should 
be taken into account that there are cases when a homogeneous 
equation has infinitely many linearly independent solutions and 
cases when a non-homogeneous equation has no solutions. 

We shall demonstrate the application of the operator of differen- 
tiation to solving linear equations with constant coefficients of 
form (117) (the so-called operator method of solving equations). The 
equation can be rewritten as 

$$(D^3 + a_1 D^2 + a_2 D + a_3)\, y = f(x)$$

There is a linear differential operator of the third order with
constant coefficients inside the parentheses. We can factor the operator
into linear factors according to algebraic rules (see Sec. VIII.8):

$$(D - p_1)(D - p_2)(D - p_3)\, y = f(x) \qquad (137)$$

where $p_1$, $p_2$ and $p_3$ are the roots of the characteristic equation (see
Sec. 17). Since

$$D (e^{-px} y) = (e^{-px} y)' = -p e^{-px} y + e^{-px} y' = e^{-px} (y' - py) = e^{-px} (D - p)\, y$$

we have

$$(D - p)\, y = e^{px} D (e^{-px} y)$$

and therefore equation (137) can be rewritten in the form

$$e^{p_1 x} D \left( e^{-p_1 x} e^{p_2 x} D \left( e^{-p_2 x} e^{p_3 x} D \left( e^{-p_3 x} y \right) \right) \right) = f(x)$$

Transposing the factors from the left-hand side to the right-hand
side and taking into account that the equality $Dy = z$ is equivalent
to the equality $y = \int z\, dx$ we obtain the general solution of the
original equation:

$$y = e^{p_3 x} \int e^{(p_2 - p_3) x} \left( \int e^{(p_1 - p_2) x} \left( \int e^{-p_1 x} f(x)\, dx \right) dx \right) dx$$

It is apparent that after integration we obtain the same result
as in Secs. 15 and 18 although we have used a different approach.
There are more complicated problems for which the operator method
may be essentially useful. We must note that such a simple
factorization of an operator into a product of several operators of lower
order can rarely be applied effectively to linear differential operators
with variable coefficients and moreover to non-linear operators.

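We now consider one more simple example. Let it be necessary
to find a solution of the equation

$$y'' + \omega^2 y = f(x)$$

where all the quantities are regarded as being real. We write, in
succession,

$$(D^2 + \omega^2)\, y = f(x), \quad (D - i\omega)(D + i\omega)\, y = f(x),$$

$$e^{i \omega x} D \left( e^{-i \omega x} (D + i\omega)\, y \right) = f(x),$$

$$(D + i\omega)\, y = e^{i \omega x} \int e^{-i \omega x} f(x)\, dx \quad \text{and} \quad \omega y = \operatorname{Im} \left( e^{i \omega x} \int e^{-i \omega x} f(x)\, dx \right)$$

(the sign Im designates the imaginary part according to the notation
introduced in Secs. VIII.1 and VIII.6). Finally,

$$y = \frac{1}{\omega} \operatorname{Im} \left( e^{i \omega x} \int e^{-i \omega x} f(x)\, dx \right)$$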
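The final formula lends itself to direct numerical evaluation. The sketch below is a hedged illustration: the name `particular_solution`, the choice of 0 as the lower limit of integration (which singles out the solution with $y(0) = 0$, $y'(0) = 0$) and the use of the composite trapezoidal rule are all assumptions of the example.

```python
import cmath
import math

def particular_solution(f, omega, x, n=2000):
    """y(x) = (1/omega) Im( e^{i omega x} * integral from 0 to x of e^{-i omega s} f(s) ds )."""
    h = x / n
    # composite trapezoidal rule for the complex-valued integrand
    total = 0.5 * (f(0.0) + cmath.exp(-1j * omega * x) * f(x))
    for k in range(1, n):
        s = k * h
        total += cmath.exp(-1j * omega * s) * f(s)
    integral = total * h
    return (cmath.exp(1j * omega * x) * integral).imag / omega

# for f(x) = 1 the solution with zero initial data is (1 - cos(omega x)) / omega^2
omega, x = 2.0, 1.5
print(particular_solution(lambda s: 1.0, omega, x))
print((1 - math.cos(omega * x)) / omega**2)
```

The two printed numbers should agree to within the accuracy of the quadrature.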

§ 6. Systems of Linear Equations 

21. Systems of Linear Equations. For definiteness, let us consider
a homogeneous linear system of three first-order equations in three
unknown functions $y(x)$, $z(x)$ and $u(x)$. We take the system in the
form solved for the derivatives of the functions:

$$\begin{aligned} y' &= a_1(x)\, y + b_1(x)\, z + c_1(x)\, u \\ z' &= a_2(x)\, y + b_2(x)\, z + c_2(x)\, u \\ u' &= a_3(x)\, y + b_3(x)\, z + c_3(x)\, u \end{aligned} \qquad (138)$$

We remind the reader that a system of any order can be reduced to
a first-order system (see Sec. 11) and that the operation of resolving
a system with respect to the derivatives is performed algebraically
without solving the differential equations.

We can easily pass from system (138) to an equivalent third-order
equation (see Sec. 11) which is also a homogeneous linear equation.
Therefore all the properties of homogeneous linear equations
enumerated in Sec. 14 are extended to system (138). The sum of two
solutions $y = y_1$, $z = z_1$, $u = u_1$ and $y = y_2$, $z = z_2$, $u = u_2$ should be
understood as a new solution of the form $y = y_1 + y_2$, $z = z_1 + z_2$,
$u = u_1 + u_2$. Similarly, the product of a solution $y = y_1$, $z = z_1$,
$u = u_1$ by a constant $C$ is the new solution $y = C y_1$, $z = C z_1$,
$u = C u_1$. Hence, linear operations on solutions are performed here
in the same way as on vectors (see Sec. VII.10).

In particular, the general solution of system (138) has the form

$$\begin{aligned} y &= C_1 y_1 + C_2 y_2 + C_3 y_3 \\ z &= C_1 z_1 + C_2 z_2 + C_3 z_3 \\ u &= C_1 u_1 + C_2 u_2 + C_3 u_3 \end{aligned} \qquad (139)$$

where $C_1$, $C_2$ and $C_3$ are arbitrary constants and $(y_1, z_1, u_1)$,
$(y_2, z_2, u_2)$ and $(y_3, z_3, u_3)$ are three linearly independent solutions
of system (138), that is such solutions that none of them is a linear
combination of the rest.

Let us dwell in more detail on property 4 in Sec. 14. If a nonzero
solution $(y_1, z_1, u_1)$ of system (138) is known then making the
substitutions $y = y_1 \bar{y}$, $z = z_1 \bar{z}$, $u = u_1 \bar{u}$ and $\bar{y} = \bar{u} + v$, $\bar{z} = \bar{u} + w$
we easily derive a system of two equations of the first order in two
unknown functions $v(x)$ and $w(x)$ from which $\bar{u}$ is found by a single
integration.

The verification of the last assertion is left to the reader.

All the properties enumerated in Sec. 15 are also extended to
non-homogeneous linear systems of the form

$$\begin{aligned} y' &= a_1(x)\, y + b_1(x)\, z + c_1(x)\, u + f_1(x) \\ z' &= a_2(x)\, y + b_2(x)\, z + c_2(x)\, u + f_2(x) \\ u' &= a_3(x)\, y + b_3(x)\, z + c_3(x)\, u + f_3(x) \end{aligned} \qquad (140)$$

In particular, the method of variation of arbitrary constants (Sec. 3)
is applied in the following manner. Let general solution (139) of the
corresponding homogeneous system (138) be known. Then a solution
of system (140) is sought in the form

$$\begin{aligned} y &= \varphi_1(x)\, y_1 + \varphi_2(x)\, y_2 + \varphi_3(x)\, y_3 \\ z &= \varphi_1(x)\, z_1 + \varphi_2(x)\, z_2 + \varphi_3(x)\, z_3 \\ u &= \varphi_1(x)\, u_1 + \varphi_2(x)\, u_2 + \varphi_3(x)\, u_3 \end{aligned}$$

After substituting these expressions into (140) we obtain the system

$$\begin{aligned} \varphi_1' y_1 + \varphi_2' y_2 + \varphi_3' y_3 &= f_1(x) \\ \varphi_1' z_1 + \varphi_2' z_2 + \varphi_3' z_3 &= f_2(x) \\ \varphi_1' u_1 + \varphi_2' u_2 + \varphi_3' u_3 &= f_3(x) \end{aligned}$$

(verify the calculations!). The derivatives $\varphi_1'$, $\varphi_2'$ and $\varphi_3'$ are found
algebraically from the last relations. Then, integrating, we obtain
$\varphi_1$, $\varphi_2$ and $\varphi_3$.

Homogeneous linear systems with constant coefficients are especially
important. For example, let us take the system

$$\begin{aligned} y' &= a_1 y + b_1 z + c_1 u \\ z' &= a_2 y + b_2 z + c_2 u \\ u' &= a_3 y + b_3 z + c_3 u \end{aligned} \qquad (141)$$

in which all the coefficients $a_1, b_1, \dots, c_3$ are constant. The method
of solving such a system is similar to that of Sec. 17. Namely,
nonzero particular solutions are sought in the form

$$y = \lambda e^{px}, \quad z = \mu e^{px}, \quad u = \nu e^{px} \qquad (142)$$

where $\lambda$, $\mu$, $\nu$ and $p$ are unknown constants that must be determined.
Substituting (142) into (141), cancelling out $e^{px}$ and transposing
all the terms to the left-hand side we obtain

$$\begin{aligned} (a_1 - p) \lambda + b_1 \mu + c_1 \nu &= 0 \\ a_2 \lambda + (b_2 - p) \mu + c_2 \nu &= 0 \\ a_3 \lambda + b_3 \mu + (c_3 - p) \nu &= 0 \end{aligned} \qquad (143)$$

These equalities should be regarded as a system of three
homogeneous algebraic equations of the first degree in three unknowns
$\lambda$, $\mu$ and $\nu$. For the system to have a nonzero solution (it is the only
kind of solution that we are interested in) it is necessary and
sufficient, according to the end of Sec. VI.6, that the determinant of
the system be equal to zero. Thus,

$$\begin{vmatrix} a_1 - p & b_1 & c_1 \\ a_2 & b_2 - p & c_2 \\ a_3 & b_3 & c_3 - p \end{vmatrix} = 0 \qquad (144)$$

This is the characteristic equation of system (141) from which we
find all the possible values of $p$.

Equation (144) being an algebraic equation of the third degree
in $p$ (why is it so?), there are three roots $p_1$, $p_2$ and $p_3$. If all the
roots are simple we can substitute any of them into system (143)
and find a nonzero solution $\lambda$, $\mu$ and $\nu$. Then formula (142) yields
the corresponding solution $y(x)$, $z(x)$, $u(x)$. The three particular
solutions which correspond to $p = p_1$, $p = p_2$ and $p = p_3$ enable
us to put down the general solution of system (141) in accord
with formula (139):

$$\begin{aligned} y &= C_1 \lambda_1 e^{p_1 x} + C_2 \lambda_2 e^{p_2 x} + C_3 \lambda_3 e^{p_3 x} \\ z &= C_1 \mu_1 e^{p_1 x} + C_2 \mu_2 e^{p_2 x} + C_3 \mu_3 e^{p_3 x} \\ u &= C_1 \nu_1 e^{p_1 x} + C_2 \nu_2 e^{p_2 x} + C_3 \nu_3 e^{p_3 x} \end{aligned} \qquad (145)$$

where $C_1$, $C_2$ and $C_3$ are arbitrary constants.

If equation (144) has imaginary roots we can retain the solution 
in form (145) (which will contain complex functions of x) or, by 
analogy with case 2 in Sec. 17, write the solution in the real form. 
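The procedure can be sketched for a system of two equations, $y' = a_1 y + b_1 z$, $z' = a_2 y + b_2 z$; restricting to two unknowns is an assumption made to keep the example short (the text treats three), and all function names are hypothetical. The characteristic equation is then a quadratic, and a nonzero solution $(\lambda, \mu)$ of the two-equation analogue of (143) is read off from one of its equations.

```python
import cmath

def characteristic_roots(a1, b1, a2, b2):
    """Roots p of the determinant equation (a1 - p)(b2 - p) - a2 b1 = 0."""
    trace, det = a1 + b2, a1 * b2 - a2 * b1
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2

def amplitude_vector(p, a1, b1, a2, b2):
    """A nonzero (lam, mu) with (a1 - p) lam + b1 mu = 0 and a2 lam + (b2 - p) mu = 0."""
    if b1 != 0:
        return b1, p - a1
    if a2 != 0:
        return p - b2, a2
    # diagonal coefficient matrix: the unknowns are uncoupled
    return (1, 0) if p == a1 else (0, 1)

def solution(C, roots, vectors, x):
    """Analogue of (145): sum of C_k * vector_k * e^{p_k x} (simple roots assumed)."""
    y = sum(c * v[0] * cmath.exp(p * x) for c, p, v in zip(C, roots, vectors))
    z = sum(c * v[1] * cmath.exp(p * x) for c, p, v in zip(C, roots, vectors))
    return y, z

# y' = z, z' = -2y - 3z has the characteristic roots -1 and -2
coeffs = (0.0, 1.0, -2.0, -3.0)
roots = characteristic_roots(*coeffs)
vectors = [amplitude_vector(p, *coeffs) for p in roots]
print(roots, vectors)
print(solution((1.0, 1.0), roots, vectors, 0.5))
```

When the roots are complex the returned values are complex functions of $x$, in agreement with the remark about form (145).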

The case when equation (144) has multiple roots is more
complicated. We shall not consider this case in the general form, but in
concrete problems one can use the following procedure. For example,
if $p_1$ is a double root, particular solutions corresponding to $p = p_1$
should be looked for in the form

$$y = (\lambda x + \tilde{\lambda})\, e^{p_1 x}, \quad z = (\mu x + \tilde{\mu})\, e^{p_1 x}, \quad u = (\nu x + \tilde{\nu})\, e^{p_1 x} \qquad (146)$$

in place of (142). Substituting (146) into (141), cancelling out $e^{p_1 x}$
and equating coefficients in the same powers of $x$ we obtain
equations from which we find two different and independent sets of
possible values of the coefficients $\lambda, \tilde{\lambda}, \dots, \tilde{\nu}$. This results in two
independent solutions of system (141). If a root of the characteristic
equation is of higher multiplicity the corresponding particular
solution of system (141) becomes more complicated.

Matrices are widely used in the theory of linear differential
equations to simplify the form of writing such systems (see Secs. XI.1-2).
To do this we usually rewrite system (138) in the form

$$\begin{aligned} y_1' &= a_{11}(x)\, y_1 + a_{12}(x)\, y_2 + a_{13}(x)\, y_3 \\ y_2' &= a_{21}(x)\, y_1 + a_{22}(x)\, y_2 + a_{23}(x)\, y_3 \\ y_3' &= a_{31}(x)\, y_1 + a_{32}(x)\, y_2 + a_{33}(x)\, y_3 \end{aligned} \qquad (147)$$

and introduce the coefficient matrix

$$A(x) = \begin{pmatrix} a_{11}(x) & a_{12}(x) & a_{13}(x) \\ a_{21}(x) & a_{22}(x) & a_{23}(x) \\ a_{31}(x) & a_{32}(x) & a_{33}(x) \end{pmatrix}$$

and the vector solution

$$\mathbf{y}(x) = \begin{pmatrix} y_1(x) \\ y_2(x) \\ y_3(x) \end{pmatrix}$$

which is a one-column matrix. For our further aims it should be
noted that if we are given a matrix $B(x) = (b_{ij}(x))$ the rules of
performing linear operations on matrices (see the beginning of
Sec. XI.2) imply that $\frac{\Delta B}{\Delta x} = \left( \frac{\Delta b_{ij}}{\Delta x} \right)$ and hence $B'(x) = (b'_{ij}(x))$.
This means that to differentiate a matrix we must differentiate all
its elements. It is also easy to verify that all the basic rules of
differentiation (such as the formulas for the derivative of a sum or of
a product) remain true for matrices. It follows that

$$\mathbf{y}' = \begin{pmatrix} y_1' \\ y_2' \\ y_3' \end{pmatrix}$$

and therefore, after a manner of (XI.9), system (147) can be written
in the matrix form

$$\mathbf{y}' = A(x)\, \mathbf{y} \qquad (148)$$

Accordingly, non-homogeneous linear system (140) can be put down
in an analogous form

$$\mathbf{y}' = A(x)\, \mathbf{y} + \mathbf{f}(x), \quad \left( \mathbf{f}(x) = \begin{pmatrix} f_1(x) \\ f_2(x) \\ f_3(x) \end{pmatrix} \right)$$

Solution (139) can be rewritten in the vector form

$$\mathbf{y} = C_1 \mathbf{y}_1 + C_2 \mathbf{y}_2 + C_3 \mathbf{y}_3$$

where the indices designate the numbers of the corresponding
linearly independent particular vector solutions of equation (148).
System (141) turns into

$$\mathbf{y}' = A \mathbf{y} \qquad (149)$$

and its solution (142) is of the form

$$\mathbf{y} = e^{px} \mathbf{a} \qquad (150)$$

where

$$\mathbf{a} = \begin{pmatrix} \lambda \\ \mu \\ \nu \end{pmatrix}$$

is a constant vector. Substituting (150) into (149) we get
$p e^{px} \mathbf{a} = A e^{px} \mathbf{a}$, that is $A \mathbf{a} = p \mathbf{a}$.

We see (compare with Sec. XI.4) that $\mathbf{a}$ and $p$ must be, respectively,
an eigenvector and the corresponding eigenvalue of the matrix $A$.
As it was shown in Sec. XI.4, an eigenvalue is found from the
equation

$$\det (A - pI) = 0$$

which is nothing but equation (144).

In solving systems of form (141) with constant coefficients (and
also the corresponding non-homogeneous systems) we can apply
the operator method (see Sec. 20). To do this we write $y' = Dy$,
$z' = Dz$ and $u' = Du$ and then solve the resulting system of equations
as an algebraic system in the unknowns $y$, $z$ and $u$ according to the
methods of Sec. VI.4. But when doing this we stop performing
algebraic operations when we arrive at formulas of type (VI.13)
which are of the form $P(D)\, y = f(x)$ in our case. Then we use
the methods of Sec. 20 and thus complete the process of solving
the system. The procedure described here is nothing but another
form of the method of reducing a first-order system to one equation
of higher order.

22. Applications to Testing Lyapunov Stability of Equilibrium
State. We understand the stability of an object, a state or a process
as its ability to oppose external actions. It is often impossible to
take some of these external factors into account beforehand. The
concept of stability emerged as early as antiquity and now it plays
an important role in physics and engineering. There are different
concrete realizations of this general notion depending on the type
of the object under consideration, on the character of external effects
and so on. One of the realizations was discussed in our course in
Sec. 16. Here we are going to consider the notion of the Lyapunov
stability which is one of the most important forms of stability. It was
introduced by the prominent Russian mathematician A. M. Lyapunov
(1857-1918) in 1892.

Let the state of an object be described by means of a finite number
of parameters. For definiteness, suppose that there are three such
parameters $x$, $y$, $z$. Then a process in which the object changes as
time passes is determined by three functions $x = x(t)$, $y = y(t)$,
$z = z(t)$ (where $t$ is time). Let the law of the process be described
by a system of differential equations of the form

$$\begin{aligned} \frac{dx}{dt} &= P(x, y, z) \\ \frac{dy}{dt} &= Q(x, y, z) \\ \frac{dz}{dt} &= R(x, y, z) \end{aligned} \qquad (151)$$

where the right-hand sides which are regarded as being known do
not contain the independent variable $t$ explicitly. The last
condition means that the character of the differential law of
development of the process does not change as time passes.

Suppose that an equilibrium state of the object, that is a state
in which it does not change in time, is described by certain
constant values $x = x_0$, $y = y_0$, $z = z_0$. Then these constants, regarded
as functions of the time, must satisfy system (151). The direct
substitution of $x_0$, $y_0$, $z_0$ into (151) implies that for this to be so it is
necessary and sufficient that the relations

$$P(x_0, y_0, z_0) = 0, \quad Q(x_0, y_0, z_0) = 0, \quad R(x_0, y_0, z_0) = 0 \qquad (152)$$

should hold simultaneously.

Suppose that at an instant $t_0$ the object is shifted from the
equilibrium position, that is the parameters become equal to
$x = x_0 + \Delta x_0$, $y = y_0 + \Delta y_0$ and $z = z_0 + \Delta z_0$. To investigate the further
changes of the state of the object we must solve system (151) with
the initial conditions

$$x(t_0) = x_0 + \Delta x_0, \quad y(t_0) = y_0 + \Delta y_0, \quad z(t_0) = z_0 + \Delta z_0 \qquad (153)$$

The equilibrium state in question is said to be stable (Lyapunov
stable) if after any infinitesimal displacement from the state the
object remains all the time in an infinitesimal vicinity of the
equilibrium state. In other words, the differences

$$\Delta x = x(t) - x_0, \quad \Delta y = y(t) - y_0, \quad \Delta z = z(t) - z_0$$

corresponding to the solutions of system (151) with initial
conditions (153), for infinitesimal $\Delta x_0$, $\Delta y_0$, $\Delta z_0$, must be infinitesimal
over the whole time interval $t_0 \le t < \infty$.

At first sight one may find it strange that we consider infinitesimal
deviations of the parameters and an infinite time interval because,
practically, all the quantities are finite in reality. But here it is
advisable to recall the difference between the mathematical and
practical infinities (see Sec. III.1). A practical infinitesimal
quantity is a real quantity which is small relative to the scale of the
process in question. Similarly, a practically infinite time interval
is the interval of a transient process, that is a process of passing
from the state under consideration to a state of a different type
(for instance, the transition from a state of equilibrium to another
state of this type or from a state of equilibrium to the collapse of
the object and so on). Hence, in reality the Lyapunov stability
means that any small deviation from the equilibrium state does
not practically change the state.

To test whether there is stability we substitute $x = x_0 + \Delta x$,
$y = y_0 + \Delta y$ and $z = z_0 + \Delta z$ into system (151) which results in

$$\begin{aligned} \frac{d(\Delta x)}{dt} &= P(x_0 + \Delta x, y_0 + \Delta y, z_0 + \Delta z) = (P'_x)_0 \Delta x + (P'_y)_0 \Delta y + (P'_z)_0 \Delta z + \dots \\ \frac{d(\Delta y)}{dt} &= Q(x_0 + \Delta x, y_0 + \Delta y, z_0 + \Delta z) = (Q'_x)_0 \Delta x + (Q'_y)_0 \Delta y + (Q'_z)_0 \Delta z + \dots \\ \frac{d(\Delta z)}{dt} &= R(x_0 + \Delta x, y_0 + \Delta y, z_0 + \Delta z) = (R'_x)_0 \Delta x + (R'_y)_0 \Delta y + (R'_z)_0 \Delta z + \dots \end{aligned} \qquad (154)$$

where the notation $(P'_x)_0 = P'_x(x_0, y_0, z_0)$ etc. is introduced. In
transforming the right-hand sides we have used Taylor's formula
(XII.17) and formulas (152). Here the dots designate the terms
of the order of smallness higher than the first.

When investigating the stability we consider only small values
of $\Delta x$, $\Delta y$, $\Delta z$ and therefore the most important role in the
right-hand sides of system (154) belongs to the linear terms that are put
down there. Therefore we replace system (154) by a truncated system
(a system of the first approximation) which is obtained by dropping
the terms of higher order of smallness and which has the form

$$\begin{aligned} \frac{d(\Delta x)}{dt} &= (P'_x)_0 \Delta x + (P'_y)_0 \Delta y + (P'_z)_0 \Delta z \\ \frac{d(\Delta y)}{dt} &= (Q'_x)_0 \Delta x + (Q'_y)_0 \Delta y + (Q'_z)_0 \Delta z \\ \frac{d(\Delta z)}{dt} &= (R'_x)_0 \Delta x + (R'_y)_0 \Delta y + (R'_z)_0 \Delta z \end{aligned} \qquad (155)$$

System (155) is a linear system with constant coefficients which
can be solved by the method of Sec. 21. According to formula (145)
(in which, of course, we had different notation) the general solution
of system (155) is a linear combination of functions of the form
$e^{pt}$ where $p$ satisfies the characteristic equation

$$\begin{vmatrix} (P'_x)_0 - p & (P'_y)_0 & (P'_z)_0 \\ (Q'_x)_0 & (Q'_y)_0 - p & (Q'_z)_0 \\ (R'_x)_0 & (R'_y)_0 & (R'_z)_0 - p \end{vmatrix} = 0 \qquad (156)$$

To small $\Delta x_0$, $\Delta y_0$, $\Delta z_0$ there correspond small values of the
arbitrary constants $C_1$, $C_2$, $C_3$. Therefore the behaviour of a solution,
as $t \to \infty$, is completely determined by the behaviour of the functions
$e^{pt}$. If $p = r + is$ (the case $s = 0$ is not excluded here) we have
$|e^{pt}| = e^{rt}$ [see formula (VIII.13)] and hence

$$|e^{pt}| \xrightarrow[t \to \infty]{} 0 \ \text{ for } r < 0 \quad \text{and} \quad |e^{pt}| \xrightarrow[t \to \infty]{} \infty \ \text{ for } r > 0 \qquad (157)$$

Consequently, we arrive at the following conclusions: if all the
roots of characteristic equation (156) have negative real parts (in
particular, they may be negative real numbers) the state of
equilibrium $x_0$, $y_0$, $z_0$ is stable. Besides, in this case we have $x(t) \to x_0$,
$y(t) \to y_0$ and $z(t) \to z_0$ for small $\Delta x_0$, $\Delta y_0$, $\Delta z_0$, as $t \to \infty$. In such
a case we say that the equilibrium state is asymptotically stable.
But if there is at least one root with a positive real part the state
of equilibrium is unstable.

We have derived these results from system (155) but according 
to the above considerations the same assertions are true for the 
complete system, i.e. system (154). It should be noted that our 
conclusions also remain true for the case of multiple roots of equation 
(156) although then we can have powers of t as factors entering 
into the solution. In fact, exponential function (157) approaching 
zero, for r < 0, faster than any negative power of t , the above asser- 
tion appears evident. 

Both conclusions obtained above do not include the case when 
fhere are no roots with positive real parts among the roots of equa- 
tion (156) but there is at least one root having a zero real part. In 
this case there appear functions of the form 

e ,sl = cos si + i sin st (|e ,s( | — 1) 

in the general solution of system (155). This implies that the object 
is likely to oscillate about the equilibrium position or to remain 
motionless near this position without approaching it. But the time 
interval being infinite, the terms of higher order of smallness that 
have been dropped begin to influence the process noticeably, and 
this can violate the stability. It is therefore impossible to arrive 
at any conclusions on the stability or instability of the state of 
equilibrium in this special case judging by the roots of equation (156). 
To investigate the stability in such a case we must involve some 
additional considerations; for instance, we can try to consider sub- 
sequent terms in expansions (154). 
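
The three cases distinguished above can be collected in a short Python sketch. The function name `classify_equilibrium` and the tolerance `tol` are our own illustrative choices; the roots of characteristic equation (156) are assumed to have been found beforehand by any suitable method.

```python
def classify_equilibrium(roots, tol=1e-12):
    """Classify an equilibrium state from the roots p of the characteristic equation.

    `roots` may contain real or complex numbers; `tol` (an assumed threshold)
    decides when a real part is treated as zero in floating-point arithmetic.
    """
    real_parts = [complex(p).real for p in roots]
    if any(r > tol for r in real_parts):
        return "unstable"
    if all(r < -tol for r in real_parts):
        return "asymptotically stable"
    # No positive real parts, but at least one is zero: the linear
    # approximation alone does not settle the question.
    return "undecided by the linear approximation"


print(classify_equilibrium([-1, -2 + 3j, -2 - 3j]))
print(classify_equilibrium([-1, 0.5, -3]))
print(classify_equilibrium([-1, 2j, -2j]))
```

The last call corresponds to the special case of purely imaginary roots, for which subsequent terms of the expansions have to be examined.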

In the case when the changes in our object are described by means
of one function $x(t)$ satisfying the differential equation

$$\frac{dx}{dt} = f(x) \qquad (158)$$

the above results are especially simple. We see that if $f(x_0) = 0$
and $f'(x_0) < 0$ the value $x = x_0$ corresponds to a stable equilibrium
state, and if $f(x_0) = 0$ and $f'(x_0) > 0$ the state is unstable. [We
suggest that the reader should draw this conclusion on the basis
of the disposition of the isoclines of equation (158) in the $t, x$-plane.]

There are many books on the theory of stability. We refer the 
reader to a comprehensive course [31]. 

§ 7. Approximate and Numerical Methods of 
Solving Differential Equations 

We often cannot integrate a differential equation exactly by re- 
ducing it to quadratures. Then we should apply other methods for 
constructing a solution. We have already described the simplest 
graphical method for solving first-order equations (see Sec. 3). 
Here we are going to present some methods for constructing appro- 
ximate formulas of solutions which are analogous to the methods 
of solving finite equations described in § V.1. We shall also discuss
some numerical methods of solving differential equations which 
yield a sought-for particular solution represented in a tabular form. 
For the sake of simplicity, we shall consider first-order equations 
but the same methods can naturally be extended to equations of 
an arbitrary order and to systems of equations. 

23. Iterative Method. Let us take a first-order differential equa- 
tion with a given initial condition of the form 

$$y' = f(x, y), \qquad y(x_0) = y_0 \qquad (159)$$

Taking integrals of both sides of the equation we obtain

$$\int_{x_0}^{x} y'\,dx = y - y_0 = \int_{x_0}^{x} f(x, y)\,dx = \int_{x_0}^{x} f(x, y(x))\,dx$$

Changing the notation for the variable of integration we write

$$y(x) = y_0 + \int_{x_0}^{x} f(s, y(s))\,ds \qquad (160)$$

Equation (160) is equivalent to both equalities (159) because if 
we differentiate it we obtain the first equality and if we substitute 
x = x 0 in it we obtain the second one. Equation (160) is an integral 
equation since the unknown function appears under the sign of 
integration in the equation. 

The form of equation (160) is convenient for applying the iterative 
method to it [compare with equation (V.9)] although here we have 
an unknown quantity that is a function but not a number. Now 
we must choose a certain function y 0 (x) as a zero approximation 
(initial approximation). It is desirable to choose it so that it should 
be as close as possible to the sought-for solution. If we have no 
information about the solution we can simply put $y_0(x) \equiv y_0$. The
first approximation is then found by the formula

$$y_1(x) = y_0 + \int_{x_0}^{x} f(s, y_0(s))\,ds$$

Substituting the result into the right-hand side of (160) we obtain
the second approximation and so on. Generally,

$$y_{n+1}(x) = y_0 + \int_{x_0}^{x} f(s, y_n(s))\,ds \qquad (n = 0, 1, 2, \ldots) \qquad (161)$$

As in Sec. V.3, if the iterative process converges, that is if the succes- 
sive approximations tend to a certain limiting function, as n increa- 
ses, this function satisfies equation (160). 

To verify this we can pass to the limit in
equality (161) as $n \to \infty$.

It is remarkable that the iterative meth- 
od applied to equation (160) usually con- 
verges for all x which are close enough 
to x 0 . At any rate, this is so if the condi- 
tions of Cauchy’s theorem (see Sec. 3) are 
fulfilled. The convergence is due to the fact 
that in calculating subsequent approxima- 
tions we integrate the preceding ones and 
that successive integrations “smooth” the 
function and gradually eliminate various 
irregularities which have been brought in 
by the choice of the zero approximation, 
by the round-off errors and the like. In 
contrast to it, differentiation of a function, 
as a rule, “worsens” the function and increa- 
ses the initial irregularities. Therefore an 
iterative method based on successive diffe- 
rentiations is likely to diverge. 

The difference between integration and 
differentiation is illustrated in Fig. 298. 

There is a disturbance of the function depi- 
cted in the upper part of Fig. 298. We see 
that this essentially changes the derivative of the function which 
is shown below (what is the form of the graph of the second 
derivative?). At the same time we see that the integral is changed 
very slightly.

[Fig. 298]

For example, let us consider a particular case of Riccati’s equation
(see the end of Sec. 4) of the form

$$y' = x^2 + y^2$$

with the initial condition $y(0) = 0$. After integrating we have

$$y(x) = \frac{x^3}{3} + \int_0^x y^2(s)\,ds$$


We have no information about the desired solution and therefore 
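we choose the zero function $y_0(x) \equiv 0$ as a zero approximation
which, at any rate, satisfies the initial condition. Then we obtain
(verify the calculations!)

$$y_1(x) = \frac{x^3}{3}, \qquad y_2(x) = \frac{x^3}{3} + \int_0^x \left(\frac{s^3}{3}\right)^2 ds = \frac{x^3}{3} + \frac{x^7}{63},$$

$$y_3(x) = \frac{x^3}{3} + \frac{x^7}{63} + \frac{2x^{11}}{2079} + \frac{x^{15}}{59535}$$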

and so on. We see that for small $x$ (for instance, for $|x| < 1$) the
process converges fast. Indeed, we can put $y = \frac{x^3}{3} + \frac{x^7}{63}$ for $|x| < 1$
with an accuracy of 0.001, and for $|x| < \frac{2}{3}$ the same accuracy
is attained if we simply set $y = \frac{x^3}{3}$.

The question as to what is the approximation at which we should 
stop the iterative process is answered by comparing subsequent 
approximations with the preceding ones. 
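
The iterations for the Riccati example can be repeated mechanically. The following Python sketch (our own illustration; the helper names `poly_mul`, `poly_integrate`, `picard_step` and `picard` are assumptions, not notation of the text) represents each approximation as a list of exact rational coefficients, the index being the power of $x$, and applies formula (161) with $f(x, y) = x^2 + y^2$, $y(0) = 0$.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_integrate(a):
    """Integral from 0 to x of the polynomial with coefficient list a."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(a)]


def picard_step(y):
    """One iteration (161) for y' = x**2 + y**2 with y(0) = 0."""
    rhs = poly_mul(y, y)
    rhs += [Fraction(0)] * max(0, 3 - len(rhs))  # make room for the x**2 term
    rhs[2] += 1
    return poly_integrate(rhs)


def picard(n):
    y = [Fraction(0)]  # zero approximation y_0(x) = 0
    for _ in range(n):
        y = picard_step(y)
    return y


y3 = picard(3)
nonzero = {k: c for k, c in enumerate(y3) if c != 0}
print(nonzero)  # powers of x mapped to their coefficients in y_3
```

Comparing `picard(n)` with `picard(n + 1)` coefficient by coefficient is one way of deciding where to stop: the coefficients that no longer change belong to the limiting function.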

24. Application of Taylor’s Series. Differentiating equation (159) 
and using the initial condition we can find the values y' ( x 0 ), y" ( x 0 ) 
etc. Therefore we can form an expansion of the solution into Taylor’s 
series (see Sec. IV. 16). The necessary number of terms that guaran- 
tees a chosen degree of accuracy is found by successive calculation 
of the terms and their comparison with the degree of accuracy. 
For instance, let us take the problem 

$$y' = x^2 + y^2, \qquad y(0) = 1$$

Substituting $x = 0$ into the right-hand side of the equation we find

$$y'(0) = 0^2 + 1^2 = 1$$

Differentiating both sides of the equation we obtain $y'' = 2x + 2yy'$
and then, substituting $x = 0$, we derive

$$y''(0) = 2 \cdot 0 + 2 \cdot 1 \cdot 1 = 2$$

We likewise find

$$y''' = 2 + 2y'^2 + 2yy'', \quad y'''(0) = 8, \quad y^{IV} = 6y'y'' + 2yy''', \quad y^{IV}(0) = 28$$

and so on. Substituting the results into Maclaurin’s formula (IV.54)
we obtain

$$y = y(0) + \frac{y'(0)}{1!}x + \frac{y''(0)}{2!}x^2 + \ldots = 1 + x + x^2 + \frac{4}{3}x^3 + \frac{7}{6}x^4 + \ldots$$


The formula can be used for small values of | x |. 
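
The derivatives at zero can also be checked by machine. The sketch below is our own illustration and does not differentiate the equation repeatedly; instead it finds the Maclaurin coefficients $c_k = y^{(k)}(0)/k!$ from the equivalent recurrence $(n+1)c_{n+1} = [x^n](x^2 + y^2)$ and then multiplies by $k!$. The function name `taylor_coefficients` is an assumption.

```python
from fractions import Fraction
from math import factorial


def taylor_coefficients(y0, n_terms):
    """Maclaurin coefficients c_k of the solution of y' = x**2 + y**2, y(0) = y0."""
    c = [Fraction(y0)]
    for n in range(n_terms - 1):
        # coefficient of x**n in y**2 (Cauchy product of the series with itself)
        rhs = sum(c[i] * c[n - i] for i in range(n + 1))
        if n == 2:
            rhs += 1  # contribution of the x**2 term of the right-hand side
        c.append(rhs / (n + 1))  # from (n + 1) * c_{n+1} = [x**n](x**2 + y**2)
    return c


coeffs = taylor_coefficients(1, 5)
derivatives = [ck * factorial(k) for k, ck in enumerate(coeffs)]
print(coeffs)
print(derivatives)  # y(0), y'(0), y''(0), y'''(0), y^IV(0)
```

Increasing `n_terms` gives as many further terms as are needed for the chosen degree of accuracy.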

25. Application of Power Series with Undetermined Coefficients. 
This method is closely related to Sec. 24. According to the method 
we seek a solution of a given equation in the form of a series with
unknown coefficients:

$$y = a + b(x - x_0) + c(x - x_0)^2 + d(x - x_0)^3 + \ldots \qquad (162)$$

The coefficients are found after we substitute the series into the 
equation and equate the coefficients of the same powers of x (and use 
the initial conditions if there are any). 

For example, let us consider Airy’s equation

$$y'' - xy = 0 \qquad (163)$$

We shall look for a solution in the form of an expansion into powers
of $x$:

$$y = a_0 + a_1x + a_2x^2 + a_3x^3 + \ldots \qquad (164)$$

After (164) is differentiated and substituted into the equation
we obtain

$$(1 \cdot 2a_2 + 2 \cdot 3a_3x + 3 \cdot 4a_4x^2 + \ldots) - x(a_0 + a_1x + a_2x^2 + \ldots) = 0$$

Equating the coefficients of equal powers of $x$ we derive

$$1 \cdot 2a_2 = 0, \quad 2 \cdot 3a_3 - a_0 = 0, \quad 3 \cdot 4a_4 - a_1 = 0, \quad 4 \cdot 5a_5 - a_2 = 0, \quad 5 \cdot 6a_6 - a_3 = 0, \quad \ldots$$

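from which we find, in succession,

$$a_2 = 0, \quad a_3 = \frac{a_0}{2 \cdot 3}, \quad a_4 = \frac{a_1}{3 \cdot 4}, \quad a_5 = \frac{a_2}{4 \cdot 5} = 0,$$

$$a_6 = \frac{a_3}{5 \cdot 6} = \frac{a_0}{2 \cdot 3 \cdot 5 \cdot 6}, \quad a_7 = \frac{a_4}{6 \cdot 7} = \frac{a_1}{3 \cdot 4 \cdot 6 \cdot 7}, \quad a_8 = \frac{a_5}{7 \cdot 8} = 0, \quad a_9 = \frac{a_6}{8 \cdot 9} = \frac{a_0}{2 \cdot 3 \cdot 5 \cdot 6 \cdot 8 \cdot 9}$$

and so on.

The substitution of these results into formula (164) yields the
general solution of equation (163):

$$y = a_0 + a_1x + \frac{a_0}{2 \cdot 3}x^3 + \frac{a_1}{3 \cdot 4}x^4 + \frac{a_0}{2 \cdot 3 \cdot 5 \cdot 6}x^6 + \frac{a_1}{3 \cdot 4 \cdot 6 \cdot 7}x^7 + \frac{a_0}{2 \cdot 3 \cdot 5 \cdot 6 \cdot 8 \cdot 9}x^9 + \ldots =$$

$$= a_0\left(1 + \frac{x^3}{2 \cdot 3} + \frac{x^6}{2 \cdot 3 \cdot 5 \cdot 6} + \frac{x^9}{2 \cdot 3 \cdot 5 \cdot 6 \cdot 8 \cdot 9} + \ldots\right) + a_1\left(x + \frac{x^4}{3 \cdot 4} + \frac{x^7}{3 \cdot 4 \cdot 6 \cdot 7} + \frac{x^{10}}{3 \cdot 4 \cdot 6 \cdot 7 \cdot 9 \cdot 10} + \ldots\right)$$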


The constants $a_0$ and $a_1$ are left undetermined and play the role
of arbitrary constants which were previously designated by $C_1$ and
$C_2$ in Sec. 14 (see property 5 in the section). The series which are
taken inside the parentheses are two linearly independent particular
solutions of equation (163). In particular, putting
$a_0 = \left[\sqrt[3]{9}\,\Gamma\left(\tfrac{2}{3}\right)\right]^{-1}$ and
$a_1 = -\left[\sqrt[3]{3}\,\Gamma\left(\tfrac{1}{3}\right)\right]^{-1}$ we obtain the so-called Airy’s function
of the first kind which is denoted as $\mathrm{Ai}(x)$.
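
The recurrence for the coefficients lends itself to direct computation. The following Python sketch is our own illustration (the names `airy_series_coefficients` and `airy_ai` and the number of retained terms are assumptions). It builds the coefficients from $(n-1)\,n\,a_n = a_{n-3}$ and sums the series with the two constants chosen as indicated above; the partial sums are adequate only for moderate $|x|$.

```python
import math


def airy_series_coefficients(a0, a1, n_terms):
    """Coefficients a_0 .. a_{n_terms-1} of a power-series solution of y'' - x*y = 0."""
    a = [a0, a1, 0.0]  # the equation 1*2*a_2 = 0 forces a_2 = 0
    for n in range(3, n_terms):
        # coefficient of x**(n-2): (n-1)*n*a_n - a_(n-3) = 0
        a.append(a[n - 3] / ((n - 1) * n))
    return a[:n_terms]


def airy_ai(x, n_terms=60):
    """Partial sum of the series with the constants that single out Ai(x)."""
    a0 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))
    a1 = -1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))
    coeffs = airy_series_coefficients(a0, a1, n_terms)
    return sum(c * x ** k for k, c in enumerate(coeffs))


print(airy_series_coefficients(1.0, 0.0, 10))  # the series multiplying a_0
print(airy_ai(0.0), airy_ai(1.0))
```

Calling `airy_series_coefficients(0.0, 1.0, n)` gives the second series in the same way.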

The above technique is always applicable to linear equations
of the form

$$a_0(x)y^{(n)} + a_1(x)y^{(n-1)} + \ldots + a_n(x)y = f(x) \qquad (165)$$

in case all the functions $a_0(x), a_1(x), \ldots, f(x)$ are polynomials in
$x$ or, in a more general case, when they are sums of power series
in powers of $x - x_0$, and $a_0(x_0) \ne 0$.

If $a_0(x_0) = 0$ the value $x_0$ is said to be a singular point of
equation (165). Then there may be no solution of form (162). In this
case it is sometimes possible to find a solution in the form

$$y = (x - x_0)^{\rho}\left[a + b(x - x_0) + c(x - x_0)^2 + d(x - x_0)^3 + \ldots\right] \qquad (166)$$

where the constant $\rho$ should also be chosen. When selecting a
solution of this form we can regard $a$ as being unequal to zero because
otherwise we can take a certain power of $x - x_0$ outside the brackets
and thus the operation reduces to a change of $\rho$ in its value.

26. Bessel’s Functions. We now consider an important example
of the so-called Bessel’s equation of the form

$$x^2y'' + xy' + (x^2 - p^2)y = 0 \qquad (167)$$

where $p = \text{const} \ge 0$ and $0 < x < \infty$. Solutions of the equation
are called Bessel’s functions although in fact they were used by
Euler beginning with 1766, that is before Bessel was born. The
functions are also called cylindrical functions because
they are widely applied to solving equations of mathematical
physics in a domain of the form of a circular cylinder.

Since the point $x = 0$ is a singular point of equation (167) its
solution should be sought for according to formula (166) in which
we must set $x_0 = 0$:

$$y = ax^{\rho} + bx^{\rho+1} + cx^{\rho+2} + dx^{\rho+3} + ex^{\rho+4} + fx^{\rho+5} + gx^{\rho+6} + ix^{\rho+7} + jx^{\rho+8} + \ldots \qquad (168)$$

The differentiation of (168) and the substitution into (167) result in

$$x^2\left[a\rho(\rho - 1)x^{\rho-2} + b(\rho + 1)\rho x^{\rho-1} + c(\rho + 2)(\rho + 1)x^{\rho} + \ldots\right] + x\left[a\rho x^{\rho-1} + b(\rho + 1)x^{\rho} + c(\rho + 2)x^{\rho+1} + \ldots\right] +$$

$$+ x^2\left(ax^{\rho} + bx^{\rho+1} + cx^{\rho+2} + \ldots\right) - p^2\left(ax^{\rho} + bx^{\rho+1} + cx^{\rho+2} + \ldots\right) = 0$$

After the coefficients of the same powers of $x$ are equated we
obtain an infinite number of equalities which are then solved in
succession:

$$\begin{aligned}
a\rho(\rho - 1) + a\rho - ap^2 = 0 \quad &\text{yields} \quad a(\rho^2 - p^2) = 0, \\
b(\rho + 1)\rho + b(\rho + 1) - bp^2 = 0 \quad &\text{yields} \quad b(\rho^2 + 2\rho + 1 - p^2) = 0, \\
c(\rho + 2)(\rho + 1) + c(\rho + 2) + a - cp^2 = 0 \quad &\text{yields} \quad c(\rho^2 + 4\rho + 4 - p^2) + a = 0, \\
d(\rho + 3)(\rho + 2) + d(\rho + 3) + b - dp^2 = 0 \quad &\text{yields} \quad d(\rho^2 + 6\rho + 9 - p^2) + b = 0, \\
e(\rho + 4)(\rho + 3) + e(\rho + 4) + c - ep^2 = 0 \quad &\text{yields} \quad e(\rho^2 + 8\rho + 16 - p^2) + c = 0
\end{aligned}$$

and so on. Since $a \ne 0$ we see that, by the first equality, we
have $\rho^2 = p^2$, i.e. $\rho = \pm p$. Substituting the result into the
remaining equalities we receive, in succession,

$$b = 0, \quad c = -\frac{a}{4\rho + 4} = -\frac{a}{2^2(\rho + 1)}, \quad d = 0, \quad e = -\frac{c}{8\rho + 16} = \frac{a}{2^4 \cdot 2(\rho + 1)(\rho + 2)},$$

$$f = 0, \quad g = -\frac{a}{2^6 \cdot 2 \cdot 3(\rho + 1)(\rho + 2)(\rho + 3)}, \quad i = 0,$$
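$$j = \frac{a}{2^8 \cdot 2 \cdot 3 \cdot 4(\rho + 1)(\rho + 2)(\rho + 3)(\rho + 4)} \quad \text{etc.}$$

From this, on the basis of formula (168), we find a solution
of the form

$$y = ax^{\rho} - \frac{a}{2^2(\rho + 1)}x^{\rho+2} + \frac{a}{2^4 \cdot 2!(\rho + 1)(\rho + 2)}x^{\rho+4} - \frac{a}{2^6 \cdot 3!(\rho + 1)(\rho + 2)(\rho + 3)}x^{\rho+6} + \ldots \qquad (169)$$

where $a$ is an arbitrary constant. Here it is convenient to put
$a = \dfrac{1}{2^{\rho}\,\Gamma(\rho + 1)}$ (see Sec. XIV.17). By formula (XIV.66), we have

$$\Gamma(\rho + 1)(\rho + 1) = \Gamma(\rho + 2), \quad \Gamma(\rho + 1)(\rho + 1)(\rho + 2) = \Gamma(\rho + 2)(\rho + 2) = \Gamma(\rho + 3) \quad \text{etc.}$$

and therefore formula (169) implies, for the above $a$, that

$$y = \frac{1}{\Gamma(\rho + 1)}\left(\frac{x}{2}\right)^{\rho} - \frac{1}{1!\,\Gamma(\rho + 2)}\left(\frac{x}{2}\right)^{\rho+2} + \frac{1}{2!\,\Gamma(\rho + 3)}\left(\frac{x}{2}\right)^{\rho+4} - \frac{1}{3!\,\Gamma(\rho + 4)}\left(\frac{x}{2}\right)^{\rho+6} + \ldots = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,\Gamma(\rho + n + 1)}\left(\frac{x}{2}\right)^{\rho+2n} \qquad (170)$$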


This sum is called Bessel’s function of the first kind of order $\rho$
and is denoted as $J_{\rho}(x)$. Since $\rho = \pm p$ the general solution of
equation (167) (see property 5 in Sec. 14) can be written as

$$y = C_1J_p(x) + C_2J_{-p}(x) \qquad (171)$$
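
Series (170) can be summed numerically without difficulty for moderate $x > 0$. The sketch below is our own illustration: the names `inv_gamma` and `bessel_j`, the number of retained terms and the convention of taking $1/\Gamma$ equal to zero at the poles of the gamma function are assumptions made for the example. As a check it uses the known closed form $J_{1/2}(x) = \sqrt{2/(\pi x)}\,\sin x$.

```python
import math


def inv_gamma(z):
    """1 / Gamma(z), taken as 0 at the poles z = 0, -1, -2, ... (a convention)."""
    if z <= 0 and z == int(z):
        return 0.0
    return 1.0 / math.gamma(z)


def bessel_j(order, x, terms=30):
    """Partial sum of series (170) for x > 0; adequate for moderate x only."""
    total = 0.0
    for n in range(terms):
        total += ((-1) ** n * inv_gamma(order + n + 1) / math.factorial(n)
                  * (x / 2.0) ** (order + 2 * n))
    return total


x = 2.0
print(bessel_j(0.5, x), math.sqrt(2.0 / (math.pi * x)) * math.sin(x))
print(bessel_j(0, 1.0), bessel_j(1, 1.0))
```

For large $x$ the terms of the series first grow considerably before they decrease, so that tables or asymptotic formulas are preferable there.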

Formula (171) for the general solution does not apply for integer
$p = 0, 1, 2, 3, \ldots$. Actually, for such $p$ and $\rho = -p$ we have

$$\Gamma(-p + 1) = \Gamma(-p + 2) = \ldots = \Gamma(-p + p) = \pm\infty$$

and therefore formula (170) results in

$$J_{-p}(x) = \sum_{n=p}^{\infty} \frac{(-1)^n}{n!\,(-p + n)!}\left(\frac{x}{2}\right)^{-p+2n} = \sum_{n'=0}^{\infty} \frac{(-1)^p(-1)^{n'}}{(p + n')!\,n'!}\left(\frac{x}{2}\right)^{p+2n'} = (-1)^pJ_p(x) \qquad (p = 0, 1, 2, \ldots)$$

(we have made the substitution $n - p = n'$). Consequently, in
this case the solutions $J_p(x)$ and $J_{-p}(x)$ are linearly dependent
(see Sec. 14), and formula (171) does not therefore express the general
solution (although it yields a family of particular solutions
depending on one essential parameter).

To obtain a formula for the general solution of equation (167)
valid for all $p$ we can apply the operation similar to the one used
in Sec. 17 (see case 3). Namely, we first suppose that $p$ is non-integral,
and form the function

$$Y_p(x) = \cot p\pi \, J_p(x) - \frac{1}{\sin p\pi}J_{-p}(x) = \frac{\cos p\pi \, J_p(x) - J_{-p}(x)}{\sin p\pi}$$

The last function being a linear combination of the solutions, we
obtain a solution of equation (167) which is called Bessel’s function
of the second kind of order $p$. It is also sometimes designated as
$N_p(x)$. Now, if $p$ becomes an integer, there appears an indeterminate
form on the right-hand side (why?). It can be calculated according
to L’Hospital’s rule but we shall not put down the calculations
here. We only note that the result will be a sum in which the
expression

$$-\frac{(p - 1)!}{\pi}\left(\frac{2}{x}\right)^p \quad (\text{for } p = 1, 2, 3, \ldots) \qquad \text{or} \qquad \frac{2}{\pi}\ln x \quad (\text{for } p = 0)$$

will be the principal term as $x \to 0$.

Thus, the formula

$$y = C_1J_p(x) + C_2Y_p(x) \qquad (172)$$

represents the general solution of equation (167) for all $p \ge 0$ (both
integral and non-integral) in the interval $0 < x < \infty$. The value
$J_p(+0)$ is finite here whereas $Y_p(+0) = -\infty$. Therefore if the
conditions of a problem imply that $y(+0)$ should be finite we must
retain only the first summand on the right-hand side of formula (172).

Bessel’s functions have been thoroughly studied, and there exist 
extensive tables for the functions. The functions 


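$$\left.\begin{aligned}
J_0(x) &= 1 - \frac{x^2}{(1!)^2\,2^2} + \frac{x^4}{(2!)^2\,2^4} - \frac{x^6}{(3!)^2\,2^6} + \ldots \\
J_1(x) &= \frac{x}{2} - \frac{x^3}{1!\,2!\,2^3} + \frac{x^5}{2!\,3!\,2^5} - \frac{x^7}{3!\,4!\,2^7} + \ldots
\end{aligned}\right\} \qquad (173)$$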

are the most important for applications. The graphs of these functions 
are approximately represented in Fig. 299. These functions and 
all Bessel’s functions of the first and of the second kind change their 
signs infinitely many times and tend to zero as $x$ increases. From
formulas (173) we can easily deduce the relationship $J_0'(x) = -J_1(x)$
(check it up!). There are some other relationships between Bessel’s
functions. All these properties can be found in [29].

27. Small Parameter Method. This method was described in Sec. 
V.5. It can also be applied to solving differential equations. Here 
we present some simple examples.

The problem

$$y' = \frac{x}{1 + 0.1xy}, \qquad y(0) = 0 \qquad (174)$$

does not involve any parameters. But we can take a more general
problem of the form

$$y' = \frac{x}{1 + \alpha xy}, \qquad y(0) = 0 \qquad (175)$$

where $\alpha$ is a small parameter. Problem (174) is a particular case
of (175) for $\alpha = 0.1$. Problem (175) can be easily solved for $\alpha = 0$.
Evidently, in this case we have $y = \frac{x^2}{2}$. Therefore we seek the
solution of problem (175) as an expansion in powers of $\alpha$, that is in
the form

$$y = \frac{x^2}{2} + \alpha u + \alpha^2 v + \alpha^3 w + \ldots \qquad (176)$$


where the functions u = u (x), v = v ( x ) etc. depending on x are 
yet unknown. 

Substituting (176) into (175) and multiplying by the denominator
we obtain

$$\left(x + \alpha u' + \alpha^2 v' + \alpha^3 w' + \ldots\right)\left(1 + \frac{\alpha}{2}x^3 + \alpha^2 xu + \alpha^3 xv + \ldots\right) = x \qquad (177)$$

The initial condition implies

$$\alpha u(0) + \alpha^2 v(0) + \ldots = 0$$

and therefore we have

$$u(0) = 0, \quad v(0) = 0, \quad w(0) = 0, \quad \ldots \qquad (178)$$

Removing the parentheses in (177) and equating the coefficients of
the powers of $\alpha$ to zero we get, in succession,

$$u' + \frac{1}{2}x^4 = 0, \qquad v' + \frac{x^3}{2}u' + x^2u = 0, \qquad w' + \frac{x^3}{2}v' + xuu' + x^2v = 0$$

and so on. Now taking into account equalities (178) we find

$$u = -\frac{x^5}{10}, \qquad v = \frac{7}{160}x^8, \qquad w = -\frac{43}{1760}x^{11}$$

etc. (check up the calculations!).
Consequently, formula (176) yields

$$y = \frac{x^2}{2} - \frac{\alpha}{10}x^5 + \frac{7\alpha^2}{160}x^8 - \frac{43\alpha^3}{1760}x^{11} + \ldots$$

In particular, putting $\alpha = 0.1$ we obtain the expression

$$y = \frac{x^2}{2} - \frac{x^5}{100} + \frac{7x^8}{16{,}000} - \frac{43x^{11}}{1{,}760{,}000} + \ldots$$

for the solution of equation (174). This series converges rapidly
for $|x| < 1$, the convergence being a little slower for $1 < |x| < 2$.
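
The successive equations for $u$, $v$, $w$ can be solved mechanically, which gives the reader a way of checking the coefficients. The following Python sketch is our own illustration (all helper names are assumptions). It assumes that problem (175) is read as $y'(1 + \alpha xy) = x$, $y(0) = 0$, which is the form leading to (177), represents every function of $x$ as a list of exact rational coefficients, and uses the fact that the coefficient of $\alpha^k$ gives $y_k' = -\sum_{j<k} y_j' \cdot x \cdot y_{k-1-j}$ with $y_0 = x^2/2$.

```python
from fractions import Fraction


def mul(a, b):
    """Product of two polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def add(a, b):
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [p + q for p, q in zip(a, b)]


def deriv(a):
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def integ(a):
    """Integral from 0 to x, so that the result vanishes at x = 0."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(a)]


def corrections(order):
    """Return [y_0, u, v, w, ...] for y' * (1 + alpha*x*y) = x, y(0) = 0."""
    x_poly = [Fraction(0), Fraction(1)]
    ys = [[Fraction(0), Fraction(0), Fraction(1, 2)]]  # y_0 = x**2 / 2
    for k in range(1, order + 1):
        total = [Fraction(0)]
        for j in range(k):
            total = add(total, mul(mul(deriv(ys[j]), x_poly), ys[k - 1 - j]))
        ys.append(integ([-c for c in total]))
    return ys


y0, u, v, w = corrections(3)
print(u[5], v[8], w[11])  # coefficients of x**5, x**8 and x**11
```

Calling `corrections` with a larger argument yields further terms of expansion (176) at no extra effort.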
As another example, let us take the problem

$$y' = \sin(xy), \qquad y(0) = \alpha \qquad (179)$$

Here, in contrast to the previous problem, there is a parameter
entering into the initial condition. Problem (179) has an apparent
solution for $\alpha = 0$, namely the solution $y = 0$. Therefore we look
for the solution of the form

$$y = \alpha u + \alpha^2 v + \alpha^3 w + \ldots \qquad (u = u(x),\ v = v(x),\ \ldots) \qquad (180)$$

for small $|\alpha|$. The substitution of the value $x = 0$ yields

$$u(0) = 1, \quad v(0) = 0, \quad w(0) = 0, \quad \ldots \qquad (181)$$

On the other hand, substituting (180) into differential equation
(179) and taking into account the power series for the sine [see
formula (IV.57)] we receive

$$\alpha u' + \alpha^2 v' + \alpha^3 w' + \ldots = \frac{\alpha xu + \alpha^2 xv + \alpha^3 xw + \ldots}{1!} - \frac{(\alpha xu + \alpha^2 xv + \alpha^3 xw + \ldots)^3}{3!} + \ldots$$

Equating the coefficients of the same powers of $\alpha$ we get

$$u' = xu, \qquad v' = xv, \qquad w' = xw - \frac{x^3u^3}{3!}, \quad \ldots$$

Hence, here we have arrived at linear equations which must be
solved in succession. Integrating the equations with initial
conditions (181) we find

$$u = e^{x^2/2}, \qquad v = 0, \qquad w = \frac{1}{12}(1 - x^2)e^{3x^2/2} - \frac{1}{12}e^{x^2/2}, \quad \ldots$$

(check up the results!). Substituting the above expressions into
(180) we obtain the sought-for solution in the form of an expansion
which is valid for small $|x|$ and $|\alpha|$.

In more complicated problems involving the small parameter 
method it often turns out that even the determination of the first 
term containing the parameter may yield fruitful results. 

28. General Remarks on Dependence of Solutions on Parameters. 
Here we give some general considerations related to the problems 
discussed in the preceding section. There are many cases when a 
differential equation in question or a system of such equations 
contains one or more parameters which can take on different con- 
stant values. For simplicity’s sake, let us consider a first-order 
equation of the form 

$$\frac{dy}{dx} = f(x, y, \lambda) \qquad (182)$$

where $\lambda$ is a parameter. Suppose we have certain initial conditions
$x = x_0$, $y = y_0$.

Let us suppose that the point $(x_0, y_0)$ is not a singular point (see
Sec. 7). This means that there is a single solution of equation (182)
satisfying the initial conditions. Then the geometric meaning of
equation (182) (see Sec. 3) implies that if its right-hand side is
continuous in $\lambda$ the change of the direction field corresponding to
small variations of $\lambda$ will also be small. Therefore the solution
$y(x, \lambda)$ will also be continuous in $\lambda$. We can likewise conclude that
the situation will be the same if not only the right-hand side of
the equation continuously depends on $\lambda$ but the initial conditions
as well, that is if $x_0 = x_0(\lambda)$ and $y_0 = y_0(\lambda)$.

Suppose that the solution $y(x, \lambda)$ of equation (182) is known for
a certain value $\lambda$ (the “non-perturbed” value of the parameter, as
it is called). Now, let the value of the parameter change and become
equal to $\lambda + \Delta\lambda$, where $|\Delta\lambda|$ is small. Then $y$ will also change and
gain an increment $\Delta_{\lambda}y$. The differential of $\Delta_{\lambda}y$, that is its principal
linear part, will be called a variation of the solution and will be
designated as $\delta y$.

Thus, a variation is a particular differential of a solution
corresponding to an increment of the parameter. The new notation has
been introduced to distinguish between the differentials
corresponding to the argument $x$ and to the parameter $\lambda$. When it is permissible
to neglect infinitesimals of higher order of smallness we can simply
say that the variation of a solution is an infinitesimal change of the
solution due to an infinitesimal change of the parameter. With a
value of $\lambda$ chosen, the quantity $\delta y$ (as well as $y$) depends on $x$. Hence
we can write $\delta y = \delta y(x)$ where $\delta y(x)$ also depends on $\Delta\lambda$ and is
directly proportional to it (although the dependence on $\Delta\lambda$ is not
indicated explicitly here).

To form a differential equation for $\delta y$ we must equate the
differentials (taken with respect to $\lambda$) of both sides of equality (182):

$$\delta\frac{dy}{dx} = \delta\big(f(x, y, \lambda)\big) = \frac{\partial f}{\partial y}\,\delta y + \frac{\partial f}{\partial \lambda}\,\delta\lambda \qquad (\delta\lambda = \Delta\lambda)$$

that is

$$\frac{d(\delta y)}{dx} = f'_y(x, y, \lambda)\,\delta y + f'_{\lambda}(x, y, \lambda)\,\delta\lambda \qquad (183)$$

In deducing (183) we have interchanged the signs $d$ and $\delta$ because
these are the signs of the differentials taken with respect to
different variables (see Sec. IX.15). We have also applied the formula
for differentiation of a composite function. Equation (183) is called
the variational equation corresponding to original equation (182).
Since the “non-perturbed” solution $y(x, \lambda)$ is substituted for $y$
in the right-hand side of (183) the equation is linear in $\delta y$ and can
be easily integrated (see Sec. 4). Variational equations which
correspond to higher-order equations or to systems of equations are
not integrable by quadratures in the general case but they are
always linear.

Let us establish the initial condition for $\delta y$. In the general case,
when $x_0 = x_0(\lambda)$ and $y_0 = y_0(\lambda)$, we substitute $\lambda + \delta\lambda$ for $\lambda$ and
neglect infinitesimals of higher order which implies that we have

$$y = y_0(\lambda + \delta\lambda) = y_0 + y_0'\,\delta\lambda \quad \text{for} \quad x = x_0(\lambda + \delta\lambda) = x_0 + x_0'\,\delta\lambda$$

where $x_0' = \dfrac{dx_0}{d\lambda}$ and $y_0' = \dfrac{dy_0}{d\lambda}$. Hence, for the value $\lambda + \delta\lambda$
of the parameter and for $x = x_0$, we have

$$y\big|_{x=x_0} = y\big|_{x=x_0+x_0'\delta\lambda} - \frac{dy}{dx}\bigg|_{x=x_0} x_0'\,\delta\lambda = y_0 + y_0'\,\delta\lambda - \frac{dy}{dx}\bigg|_{x=x_0} x_0'\,\delta\lambda = y_0 + y_0'\,\delta\lambda - f_0x_0'\,\delta\lambda$$

where $f_0 = f(x_0, y_0, \lambda)$. But the same value of $y$ is equal to
$\big(y(x, \lambda) + \delta y\big)\big|_{x=x_0} = y_0 + (\delta y)\big|_{x=x_0}$. Therefore, the initial value
of $\delta y$ is expressed by the following initial condition:

$$(\delta y)\big|_{x=x_0} = (y_0' - f_0x_0')\,\delta\lambda$$

In the particular case of values $x_0$ and $y_0$ independent of $\lambda$ we have
$x_0' = y_0' = 0$ and therefore the initial condition has the form
$(\delta y)\big|_{x=x_0} = 0$.
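
As an illustration of the variational equation, consider an example of our own choosing (it is not taken from the text): $y' = -\lambda y^2$, $y(0) = 1$, whose exact solution is $y = 1/(1 + \lambda x)$. Here $f'_y = -2\lambda y$ and $f'_{\lambda} = -y^2$, so that the quantity $s = \delta y/\delta\lambda$ satisfies $s' = -2\lambda ys - y^2$, $s(0) = 0$. The Python sketch below integrates both equations together by the classical fourth-order Runge-Kutta method; the function names and the step count are assumptions.

```python
def rk4_system(rhs, state, x0, x1, steps):
    """Classical fourth-order Runge-Kutta method for a system of equations."""
    h = (x1 - x0) / steps
    x = x0
    for _ in range(steps):
        k1 = rhs(x, state)
        k2 = rhs(x + h / 2, [s + h / 2 * k for s, k in zip(state, k1)])
        k3 = rhs(x + h / 2, [s + h / 2 * k for s, k in zip(state, k2)])
        k4 = rhs(x + h, [s + h * k for s, k in zip(state, k3)])
        state = [s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
        x += h
    return state


def sensitivity(lam, x_end, steps=2000):
    """Integrate y' = -lam*y**2, y(0) = 1 together with its variational equation."""
    def rhs(x, state):
        y, s = state            # s stands for delta_y / delta_lambda
        f_y = -2.0 * lam * y    # partial derivative of f with respect to y
        f_lam = -y * y          # partial derivative of f with respect to lambda
        return [-lam * y * y, f_y * s + f_lam]
    return rk4_system(rhs, [1.0, 0.0], 0.0, x_end, steps)


lam, x_end = 0.5, 2.0
y, s = sensitivity(lam, x_end)
print(y, 1.0 / (1.0 + lam * x_end))            # solution and its exact value
print(s, -x_end / (1.0 + lam * x_end) ** 2)    # variation per unit delta_lambda
```

The second printed pair compares the computed variation with the derivative of the exact solution with respect to $\lambda$.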

There are some cases when a parameter enters into a differential
equation in such a way that the order of the equation reduces, that
is the equation degenerates, for certain values of the parameter.
Such a situation is connected with new phenomena which we shall
illustrate by taking an example.

Consider the problem

$$\lambda y' + y = 0, \qquad y\big|_{x=0} = 1 \qquad (184)$$

whose solution is $y = e^{-x/\lambda}$. The value $\lambda = 0$ yields the degeneration
(why?). Let the solution be considered for $x \ge 0$ and let $\lambda \to +0$;
the solution is depicted in Fig. 300. The limiting case for equation
(184), as $\lambda \to +0$, is the equation $y = 0$. This is a finite equation
whose solution $y = 0$ does not satisfy the initial condition
$y\big|_{x=0} = 1$. Besides, we see that when $\lambda$ is close to zero (but unequal
to zero) the solution is close to zero for the values of $x$ which are
not too close to $x = 0$ (for instance, for the values of $x$ exceeding
the value $h$ shown in Fig. 300), this not being the case for the values
of $x$ which are close to $x = 0$ (for instance, for the values of $x$ lying
in the interval $0 < x < h$). An interval of the type $0 < x < h$
(which we have in our particular problem) is called a boundary layer.
The solution of (184) must cross the boundary layer $0 < x < h$ in
order to pass from the unit initial value (184) to a value which is
close to zero.

The width of a boundary layer is understood conditionally. In 
fact, in our example the solution never turns into zero. If we con- 
ditionally take a certain value of x as the width of the boundary 
layer, for instance, x = h for which the magnitude of the solution 
decreases 10 times in comparison with the initial magnitude, then, 
in the case of problem (184), we obtain

$$e^{-h/\lambda} = 0.1 \quad \text{and} \quad h = \ln 10 \cdot \lambda$$

that is the width of the boundary layer is directly proportional
to the value of $\lambda$.
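
The proportionality is easily tabulated. In the following Python sketch (our own illustration; the function name `boundary_layer_width` and the sample values of the parameter are assumptions) the width is computed for the solution $y = e^{-x/\lambda}$ of problem (184) with the conventional tenfold decrease.

```python
import math


def boundary_layer_width(lam, drop=10.0):
    """Value h at which y = exp(-x/lam) has fallen to 1/drop of y(0) = 1."""
    return lam * math.log(drop)


for lam in (0.1, 0.01, 0.001):
    h = boundary_layer_width(lam)
    # the last column reproduces the agreed level 0.1 up to rounding
    print(lam, h, math.exp(-h / lam))
```

Another agreed level of decrease changes only the constant factor `math.log(drop)` and not the proportionality itself.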

If $\lambda \to -0$ we obtain the solution which is shown in the dotted
line in Fig. 300. This solution tends to infinity for any $x > 0$ as
$\lambda \to -0$. This case is not as interesting as the preceding one.

An analogous phenomenon is often encountered in more complicated
cases. For instance, let the solution of a second-order equation
satisfying two initial conditions or boundary conditions (see Sec. 16)
be considered and let the order of the equation be reduced by unity
for a certain value $\lambda = \lambda_0$. Then the solution $y_0(x)$ corresponding
to $\lambda = \lambda_0$ satisfies a first-order equation. Therefore it often happens
that if the solution is finite for $\lambda = \lambda_0$ it satisfies only one of the
initial conditions and may not satisfy the other. The solution $y(x)$
has a boundary layer for the values of $\lambda$ close to $\lambda_0$ (with the width
proportional to $|\lambda - \lambda_0|$) which is traversed by the solution
when it passes from the other condition to $y_0(x)$. A similar situation
may also occur for systems of differential equations.

29. Methods of Minimizing Discrepancy. The methods are based 
on the idea that an unknown function is sought in a form containing 
several parameters, that is in the form 

$$y = \varphi(x; \lambda_1, \lambda_2, \ldots, \lambda_m) \qquad (185)$$

The right-hand side is usually chosen in such a way that the initial 
or boundary conditions imposed on a solution should be satisfied 
for any values of the parameters. After expression (185) has been 
substituted into a given differential equation we obtain the corres- 
ponding discrepancy, that is the difference between the right-hand 
and left-hand sides. The discrepancy (which we denote by h) depends 
upon the parameters: 

$$h = h(x; \lambda_1, \lambda_2, \ldots, \lambda_m)$$

If solution (185) were exact the discrepancy $h$ would equal zero
identically. Therefore, in order to determine the necessary values
of the parameters $\lambda_1, \lambda_2, \ldots, \lambda_m$, we can impose $m$ additional
conditions that should be satisfied by the discrepancy and that are
automatically fulfilled for the function which is identically equal
to zero. For instance, we can equate $h$ to zero for $m$ different values
of $x$; this is the collocation method. We can try to minimize the
integral $\int_a^b h^2\,dx$ on the interval $a \le x \le b$ in which the solution
is being constructed; this is the method of least squares. We can
equate to zero $m$ integrals of the form
$\int_a^b h\psi_1(x)\,dx$, $\int_a^b h\psi_2(x)\,dx$, ..., $\int_a^b h\psi_m(x)\,dx$
where $\psi_1(x), \psi_2(x), \ldots, \psi_m(x)$ is a chosen system
of functions; this is the method of moments (integrals of this type
are called moments).

The greater the number of the parameters we introduce, the more
flexible formula (185) (that is the greater the accuracy of the
representation of a sought-for solution which can be attained by means
of the formula). But at the same time every increase of the number
of the parameters leads to more and still more complicated
calculations. It is an art to be able to forecast the form of the sought-for
solution by using a formula containing a few parameters. The
correctness of the result can be judged by comparing the results of repeated
calculations performed according to different methods or with the
help of different numbers of parameters etc.

We can also take a right-hand side of formula (185) which iden- 
tically satisfies only a number of initial or boundary conditions 
imposed on the solution. Then the corresponding number of con- 
ditions chosen for determining the values of the parameters must 
imply that the remaining conditions should be satisfied. It is appa- 
rent that in such a case the number of additional conditions that 
can be chosen more or less arbitrarily is respectively reduced. 

Let us take an example in which it is possible to compare appro- 
ximate solutions with the exact solution. Let it be necessary to 
solve the problem

$$y'' + y = 0 \quad (0 \le x \le 1), \qquad y(0) = 0, \quad y(1) = 1$$

Let us look for the solution of the form

$$y = \lambda x + \mu x^2 \qquad (186)$$

Here the first boundary condition is satisfied automatically whereas
the second implies $\lambda + \mu = 1$. From this we obtain
$y = \lambda x + (1 - \lambda)x^2$ and hence we have only one degree of freedom at our
disposal, that is only one additional condition that we can choose
to minimize the discrepancy which has the form

$$h = y'' + y = 2(1 - \lambda) + \lambda x + (1 - \lambda)x^2$$

in our case. The collocation method applied to $x = \frac{1}{2}$ yields the
value $\lambda = \frac{9}{7}$. The method of least squares used for the interval
$0 \le x \le 1$ yields the value $\lambda = \frac{257}{202}$. Finally, the method of
moments with the function $\psi(x) \equiv 1$ yields the value $\lambda = \frac{14}{11}$ (let the
reader verify all the calculations!). Substituting these values of $\lambda$
into (186) we obtain the corresponding expressions which
approximate the exact solution $y = \frac{\sin x}{\sin 1}$ fairly well. For example, the
exact solution is equal to 0.5697 for $x = 0.5$ whereas the
approximations yield, respectively, the values 0.5714, 0.5681 and 0.5682.
Hence the error is about ±0.3 per cent.
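
The three values of the parameter can be verified in exact arithmetic. In the Python sketch below (our own illustration; the helper names are assumptions) the discrepancy of the trial function $y = \lambda x + (1 - \lambda)x^2$ is written as $h = A(x) + \lambda B(x)$ with $A = 2 + x^2$ and $B = -2 + x - x^2$, so that each method reduces to one linear equation for $\lambda$.

```python
import math
from fractions import Fraction


def poly_mul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def integral_0_1(a):
    """Integral over 0 <= x <= 1 of the polynomial with coefficient list a."""
    return sum(c / (k + 1) for k, c in enumerate(a))


def poly_eval(a, x):
    return sum(c * x ** k for k, c in enumerate(a))


A = [Fraction(2), Fraction(0), Fraction(1)]    # 2 + x**2
B = [Fraction(-2), Fraction(1), Fraction(-1)]  # -2 + x - x**2

x_c = Fraction(1, 2)
lam_collocation = -poly_eval(A, x_c) / poly_eval(B, x_c)       # h(1/2) = 0
lam_least_squares = -integral_0_1(poly_mul(A, B)) / integral_0_1(poly_mul(B, B))
lam_moments = -integral_0_1(A) / integral_0_1(B)               # integral of h is 0

exact = math.sin(0.5) / math.sin(1.0)
for name, lam in (("collocation", lam_collocation),
                  ("least squares", lam_least_squares),
                  ("moments", lam_moments)):
    approx = float(lam) * 0.5 + (1.0 - float(lam)) * 0.25
    print(name, lam, round(approx, 4), round(exact, 4))
```

The least-squares value follows from equating to zero the derivative of $\int_0^1 (A + \lambda B)^2\,dx$ with respect to $\lambda$.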

30. Simplification Method. This method is widely used in prac- 
tical calculations especially when we want to get a crude estimation 
of the result. The method includes such techniques as simplification 
of the original equation by dropping terms that are comparatively 
small, replacing slowly varying coefficients by constants and the
like. After this procedure is carried out we can arrive at an equation
of one of the integrable types. Then integrating the equation we
obtain a function which can be regarded as an approximate solution
of the original equation. At any rate, such an approximation often
correctly describes the qualitative character of the behaviour of the
exact solution. After this “zero approximation” has been found we
can often use it for determining corrections which compensate the
simplifications we have performed, and thus we can find a “first
approximation” and so on.

In case an equation involves parameters (e.g. masses, linear sizes 
of the objects under consideration etc.) we should take into account 
that the terms which we regard as being small can be different for 
different values of the parameters and therefore, generally, the 
simplification of an equation must be performed in different ways 
for different values of the parameters involved. Besides, it is
sometimes necessary to break the interval of variation of the independent
variable into several parts and simplify the equation on each of the 
parts by means of specific techniques pertaining to that very interval. 

It is especially useful to simplify an equation in the way described 
above if we use certain simplifications in the very process of dedu- 
cing the equation or if the degree of accuracy with which the quan- 
tities in question are determined is not sufficiently high. For in- 
stance, we should by all means drop such terms entering into an 
equation that are less than the admissible error of determination 
of its other terms. 

For example, let us consider the problem 

$$y'' + \frac{1}{1 + 0.1x}\,y + 0.2y^3 = 0, \qquad y(0) = 1, \quad y'(0) = 0, \quad 0 \le x \le 2 \tag{187}$$

Since the coefficient of y changes slowly, we replace it by
its mean value (see Sec. XIV.5) which is equal to

$$\frac{1}{2 - 0}\int_0^2 \frac{dx}{1 + 0.1x} = \frac{1}{2}\,\frac{\ln(1 + 0.1x)}{0.1}\bigg|_0^2 = \frac{\ln 1.2}{0.2} = 0.911$$

Besides, let us drop the third summand, which is comparatively small.
Thus we get the equation $y'' + 0.911y = 0$ whose solution satisfying
the initial conditions is

$$y = \cos 0.954x \tag{188}$$

The form of this approximate solution justifies the procedure of
dropping the last term in the equation because the ratio of the third
term to the second term is of the order of $0.2y^2 \le 0.2$ and therefore
the first term and the second term must approximately “cancel
out”. Let us now bring in a correction connected with the last summand.
To do this we substitute approximate solution (188) into
the summand and retain the averaged value of the coefficient of y:

$$y'' + 0.911y = -0.2\cos^3 0.954x = -0.05\cos 2.86x - 0.15\cos 0.954x$$

[here we have used formula (VIII.14)]. Now according to the
methods of Sec. 18 we obtain the solution

$$y = 0.993\cos 0.954x - 0.079x\sin 0.954x + 0.007\cos 2.86x$$

satisfying the initial conditions (check up the calculations!).

The difference between the last result and zero approximation 
(188) is not large and therefore our conclusions concerning the roles 
of different summands entering into equation (187) are confirmed 
again. At the same time we see that the third term of equation (187) 
has introduced a certain correction into the solution. [Think how 
we can determine the correction that takes into account the variabi- 
lity of the coefficient in y entering into equation (187).] 
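
The comparison of the zero and first approximations can also be checked
numerically. The following Python sketch is only an illustration and is not
part of the original text. It assumes that problem (187) has the form
y'' + y/(1 + 0.1x) + 0.2y^3 = 0, y(0) = 1, y'(0) = 0 on 0 <= x <= 2 (the form
suggested by the mean value 0.911 and by the term 0.2 cos^3 0.954x used
above), and it takes as a reference a solution computed by a standard
fourth-order Runge-Kutta scheme. The names rk4, rhs, zero_approx and
first_approx are hypothetical.

```python
import math

def rk4(f, x0, y0, x_end, n):
    """Classical fourth-order Runge-Kutta scheme for a system y' = f(x, y)."""
    h = (x_end - x0) / n
    x, y = x0, list(y0)
    for _ in range(n):
        k1 = f(x, y)
        k2 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(x + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(x + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        x += h
    return y

def rhs(x, state):
    """Assumed form of (187), rewritten as a first-order system (y, y')."""
    y, v = state
    return [v, -y / (1 + 0.1 * x) - 0.2 * y ** 3]

# Mean value of the slowly varying coefficient 1/(1 + 0.1x) on [0, 2].
mean_coeff = math.log(1.2) / 0.2

def zero_approx(x):
    """Zero approximation (188)."""
    return math.cos(0.954 * x)

def first_approx(x):
    """First approximation obtained after restoring the cubic term."""
    return (0.993 * math.cos(0.954 * x)
            - 0.079 * x * math.sin(0.954 * x)
            + 0.007 * math.cos(2.86 * x))

y_ref = rk4(rhs, 0.0, [1.0, 0.0], 2.0, 2000)[0]   # reference value of y(2)
err_zero = abs(zero_approx(2.0) - y_ref)
err_first = abs(first_approx(2.0) - y_ref)
print(mean_coeff, err_zero, err_first)
```

If the assumed form of the equation is right, the first approximation should
lie noticeably closer to the reference value at x = 2 than the zero
approximation does.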

Considerations of the above type are often not sufficiently rigo- 
rous and may lead to mistakes. Therefore they should be applied in 
accordance with common sense. Then, comparatively often, they 
nevertheless result in approximate solutions that can be used for 
practical purposes. 

31. Euler’s Method. Now we proceed to study some methods of 
numerical integration of differential equations. These methods are 
particularly applicable to the cases when none of the above methods
of “approximate analytical integration”, that is methods of constructing
approximate formulas for solutions, turns out to be effective,
especially if a solution must be calculated with a great accuracy
for a large interval of variation of the argument. Besides, the methods 
are used when equations are solved by means of electronic computers. 

It is often advisable to combine methods of approximate and 
numerical integration. For instance, if an initial condition for the 
equation

$$y'' + (1 + e^{-x})\,y = 0$$

is given we can apply Taylor’s formula for small values of x (see
Sec. 24), one of the methods of numerical integration for intermediate
values of x and, finally, we can simply drop the term $e^{-x}$ for large
values of x.

We shall consider four methods of numerical solution of first-order
differential equations which are most frequently used. The
methods are easily extended to systems of first-order equations to
which we can also reduce higher-order equations. In courses on
approximate calculations one can find some other methods. We
especially recommend [3], [9], [11] and [34].

Euler’s method is visual and simple but not sufficiently effective. 
The reader must understand it well because many important and 
effective methods used in different branches of mathematics are 
essentially the development of the Euler method. 

Euler’s method is based on the procedure of direct replacement 
of the derivative entering into a differential equation by the ratio 
of finite differences (difference quotient) which were considered in 
Sec. V.7. Suppose we have an initial-value problem of the form

$$y' = f(x, y), \qquad y(x_0) = y_0 \tag{189}$$

For simplicity’s sake, let us take a constant step h along the x-axis. 
We introduce the notation

$$x_0 + h = x_1, \quad x_0 + 2h = x_2, \quad x_0 + 3h = x_3, \quad \ldots$$

Approximate values of $y(x_k)$ will be designated as $y_k$. To find the
values we replace the derivative in the equation by the difference
quotient. Hence we have

$$\frac{\Delta y_k}{h} = f(x_k, y_k)$$

i.e.

$$\frac{y_{k+1} - y_k}{h} = f(x_k, y_k)$$

and hence

$$y_{k+1} = y_k + f(x_k, y_k)\,h \tag{190}$$

Beginning with $y_0$ and making k assume, in succession, the values
k = 0, 1, 2, ..., we apply formula (190) and compute the values

$$y_1 = y_0 + f(x_0, y_0)\,h, \quad y_2 = y_1 + f(x_1, y_1)\,h, \quad \ldots$$

Euler’s method has a simple geometric meaning which is illustra- 
ted in Fig. 301 where the integral curves are also depicted. We see 
that the geometric significance of the method lies in the fact that 
we draw the line segment $M_0M_1$ tangent to the desired integral
curve through the point $M_0$ instead of the integral curve itself which
is unknown. In doing so we follow the direction of the direction
field at the point $M_0$. We likewise draw the corresponding line
segment through the point $M_1$ according to the direction of the
tangent prescribed by the direction field at $M_1$ and so on. Thus we
obtain Euler’s broken line (polygonal line) which approximately 
represents the integral curve that would appear if the step h were 
infinitesimal, that is if we continuously “corrected” the direction 
of the broken line. 


The order of the error of Euler’s method can be easily estimated.
Indeed, taking advantage of formula (190), we replace the increment
of the solution over one step by its differential $y'_k\,\Delta x = f(x_k, y_k)\,h$.
This leads to an error of the order of $h^2$ [see formula (IV.49)]. When
we construct the solution on an interval $(x_0, x)$ and break the interval
into n parts we have $h = \frac{x - x_0}{n}$, and therefore the resultant
error will be of the order of $nh^2 = \frac{(x - x_0)^2}{n}$. Hence, to decrease the
error 10 times, that is to determine one more decimal digit, we must
also increase the number of points of division 10 times which leads
to a considerable increase of the amount of calculations. Here lies 
the disadvantage of the method. 
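
The first-order behaviour of the error can be observed directly. The sketch
below is an illustration only: the function name euler and the test problem
y' = y, y(0) = 1 (exact solution e^x) are choices made here, not taken from
the text. It applies formula (190) with 100 and with 1000 steps on the
interval from 0 to 1 and compares the errors at x = 1.

```python
import math

def euler(f, x0, y0, x_end, n):
    """Euler's method (190): y_{k+1} = y_k + f(x_k, y_k) h with constant step h."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for k in range(n):
        y = y + f(x, y) * h
        x = x0 + (k + 1) * h
    return y

def f(x, y):
    """Right-hand side of the illustrative test problem y' = y."""
    return y

err_100 = abs(euler(f, 0.0, 1.0, 1.0, 100) - math.e)
err_1000 = abs(euler(f, 0.0, 1.0, 1.0, 1000) - math.e)
print(err_100, err_1000, err_100 / err_1000)
```

The ratio of the two errors comes out close to 10, in agreement with the
estimate that the resultant error is inversely proportional to the number of
points of division.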

There is a specific feature of Euler’s method which is also charac- 
teristic of other methods of numerical integration of differential 
equations. We have already noted (see Sec. 7) that a solution of a 
differential equation may approach infinity for a finite value of x 
when it is continued along the x-axis. But at the same time it is 
clear that an approximate solution constructed in accordance with 
Euler’s method remains finite for all values of x. To describe the 
behaviour of the solution in such a case correctly we can apply the 
following technique: if we see that there is a considerable increase 
of the solution in its absolute value we can perform the substitution
$y = \frac{1}{z}$ in the differential equation. Then if further integration
shows that z passes through a zero value attained at a certain point
x = a, this means that $|y(a)| = \infty$.
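
The technique can be sketched on a simple example chosen here for
illustration (it is not taken from the text): the problem y' = 1 + y^2,
y(0) = 0 has the solution y = tan x, which becomes infinite at x = pi/2.
Substituting y = 1/z gives z' = -(1 + z^2). The hypothetical function
blow_up_point integrates by Euler's method in the variable y while
|y| <= 1, then passes to z and looks for the point where z changes sign.

```python
import math

def blow_up_point(h=1e-3, switch=1.0):
    """Locate the point where the solution of y' = 1 + y^2, y(0) = 0 becomes infinite."""
    x, y = 0.0, 0.0
    while abs(y) <= switch:          # ordinary Euler steps in the variable y
        y += (1 + y * y) * h
        x += h
    z = 1.0 / y                      # substitution y = 1/z, so that z' = -(1 + z^2)
    while True:
        z_new = z - (1 + z * z) * h  # Euler step in the variable z
        if z_new <= 0:
            # z has passed through zero: locate the zero by linear interpolation
            return x + h * z / (z - z_new)
        z, x = z_new, x + h

a = blow_up_point()
print(a, math.pi / 2)
```

The point a found in this way should be close to pi/2, with an error of the
order of the step h, whereas Euler's broken line in the variable y alone
would simply continue to finite values beyond that point.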

32. Runge-Kutta Method. We now demonstrate a simpler variant
of this method which refines Euler’s method. Suppose that
an approximate value $y_k$ of a solution for $x = x_k$ has already been
computed. Then we can find $y_{k+1}$ by calculating in accordance with
the formulas

$$f_k = f(x_k, y_k), \qquad \alpha_k = f\!\left(x_k + \frac{h}{2},\; y_k + \frac{f_k h}{2}\right), \qquad y_{k+1} = y_k + \alpha_k h \tag{191}$$

The geometric meaning of these calculations is illustrated in
Fig. 302. Namely, every subsequent segment $M_kM_{k+1}$ of the broken
line approximating the integral curve is constructed in the following
way. We first draw a line segment from $M_k$ according to the
slope $f_k$ of the direction field at the point $M_k$, just as in Euler’s
method. But here we do not limit ourselves to the result thus obtained
and proceed to determine the slope of the field at the midpoint
$N_k$ of this segment and to draw the new segment $M_kM_{k+1}$
in this direction. Thus, in this way we refine the slopes of the
segments of the broken line approximating the integral curve.

Even the geometric illustration shows that this method is more
accurate than Euler’s method because here we take into account
the change of the slope of the field along the interval $x_k \le x \le x_{k+1}$.

This can also be confirmed by calculations. By Taylor’s formula
(see Sec. XII.6) we get

$$\alpha_k = f(x_k, y_k) + f'_x(x_k, y_k)\,\frac{h}{2} + f'_y(x_k, y_k)\,\frac{f_k h}{2} + O(h^2)$$

where $O(h^2)$ designates a quantity which is bounded in comparison
with $h^2$ (compare with Sec. III.11). As in Sec. 24, we find

$$y'_k = f(x_k, y_k) = f_k, \qquad y''_k = f'_x(x_k, y_k) + f'_y(x_k, y_k)\,y'_k,$$

$$y_{k+1} = y_k + \alpha_k h = y_k + f(x_k, y_k)\,h + f'_x(x_k, y_k)\,\frac{h^2}{2} + f'_y(x_k, y_k)\,f_k\,\frac{h^2}{2} + O(h^3) = y_k + y'_k h + \frac{y''_k}{2}\,h^2 + O(h^3) \tag{192}$$

But the exact value of the solution satisfying the condition
$y(x_k) = y_k$ is equal to

$$y(x_k + h) = y_k + y'_k h + \frac{y''_k}{2}\,h^2 + O(h^3) \tag{193}$$

[see Taylor’s formula (IV.50)]. Comparing formulas (192) and (193)
we see that the values $y(x_k + h)$ and $y_{k+1}$ can differ only in terms
whose order of smallness is not less than that of $h^3$. From this, as
at the end of Sec. 31, we conclude that the resultant error is of the
order of $nh^3 = \frac{(x - x_0)^3}{n^2}$ or, which is the same, of the order of $h^2$.
Hence, if we increase the number of points of division 10 times the
degree of accuracy will increase 100 times.
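
The second-order behaviour can be checked on the same kind of test problem
as for Euler's method. The sketch below is an illustration only: the name
rk2_midpoint and the test problem y' = y, y(0) = 1 are assumptions made
here. It implements scheme (191), in which the slope is taken at the
midpoint of the Euler segment.

```python
import math

def rk2_midpoint(f, x0, y0, x_end, n):
    """Scheme (191): y_{k+1} = y_k + alpha_k h, alpha_k being the slope at the midpoint."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for k in range(n):
        f_k = f(x, y)                               # slope at M_k, as in Euler's method
        alpha_k = f(x + h / 2, y + f_k * h / 2)     # slope at the midpoint N_k
        y = y + alpha_k * h
        x = x0 + (k + 1) * h
    return y

def f(x, y):
    """Right-hand side of the illustrative test problem y' = y."""
    return y

err_10 = abs(rk2_midpoint(f, 0.0, 1.0, 1.0, 10) - math.e)
err_100 = abs(rk2_midpoint(f, 0.0, 1.0, 1.0, 100) - math.e)
print(err_10, err_100, err_10 / err_100)
```

With ten times as many points of division the error decreases roughly a
hundredfold, as the estimate above predicts.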
A still more precise result will be obtained if we calculate according
to the following scheme:

$$\alpha_k = f\!\left(x_k + \frac{h}{2},\; y_k + \frac{f_k h}{2}\right), \qquad \beta_k = f\!\left(x_k + \frac{h}{2},\; y_k + \frac{\alpha_k h}{2}\right),$$

$$\gamma_k = f(x_k + h,\; y_k + \beta_k h),$$

$$y_{k+1} = y_k + \frac{1}{6}\,(f_k + 2\alpha_k + 2\beta_k + \gamma_k)\,h$$

Calculations similar to (192) show that here the error made at every
step does not exceed a quantity of the order of $h^5$ and therefore the
resultant error is of the order of $h^4$. Consequently, if we increase
the number of points of division 10 times the degree of accuracy
will increase 10,000 times.
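
A minimal sketch of this fourth-order scheme follows. The function name rk4
and the test problem y' = y, y(0) = 1 are illustrative assumptions, and the
variable names follow the notation f_k, alpha_k, beta_k, gamma_k of the
scheme as reconstructed here.

```python
import math

def rk4(f, x0, y0, x_end, n):
    """Fourth-order Runge-Kutta scheme with constant step h for y' = f(x, y)."""
    h = (x_end - x0) / n
    x, y = x0, y0
    for k in range(n):
        f_k = f(x, y)
        alpha_k = f(x + h / 2, y + f_k * h / 2)
        beta_k = f(x + h / 2, y + alpha_k * h / 2)
        gamma_k = f(x + h, y + beta_k * h)
        y = y + (f_k + 2 * alpha_k + 2 * beta_k + gamma_k) * h / 6
        x = x0 + (k + 1) * h
    return y

def f(x, y):
    """Right-hand side of the illustrative test problem y' = y."""
    return y

err_10 = abs(rk4(f, 0.0, 1.0, 1.0, 10) - math.e)
err_100 = abs(rk4(f, 0.0, 1.0, 1.0, 100) - math.e)
print(err_10, err_100, err_10 / err_100)
```

Here the ratio of the errors is of the order of 10,000, which corresponds to
a resultant error of the order of $h^4$.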

33. Adams Method. This method was introduced in 1883 by the 
English astronomer J. C. Adams (1819-1892). The method is based 
on Newton’s second interpolation formula (V.29) which is applied 
to the derivative y' (x) of the solution beginning with a certain 
value $x_k = x_0 + kh$ of the argument:

$$y'(x) = y'_k + \Delta y'_{k-1}\,\frac{x - x_k}{h} + \frac{\Delta^2 y'_{k-2}}{2!}\,\frac{x - x_k}{h}\left(\frac{x - x_k}{h} + 1\right) + \frac{\Delta^3 y'_{k-3}}{3!}\,\frac{x - x_k}{h}\left(\frac{x - x_k}{h} + 1\right)\left(\frac{x - x_k}{h} + 2\right) \tag{194}$$

In applying the formula we have substituted $x_k - x = -(x - x_k)$
for $t = x_{k+1} - x$ and, respectively, $y'_k$ for the value $y_{k+1}$. Besides,
we have replaced the sign of approximate equality by the sign of
exact equality although formula (194) is, of course, approximate,
and its error is of the order of $\Delta^4 y'$, that is of the order of $h^4$ (see
Sec. V.7). Integrating formula (194) from $x_k$ to $x_{k+1} = x_k + h$
(with the help of the substitution $\frac{x - x_k}{h} = s$) we obtain

$$y_{k+1} = y_k + \left(y'_k + \frac{1}{2}\,\Delta y'_{k-1} + \frac{5}{12}\,\Delta^2 y'_{k-2} + \frac{3}{8}\,\Delta^3 y'_{k-3}\right) h \tag{195}$$

(check up the calculations!).

The error of formula (195) is the result of the integration of the
error of formula (194) and therefore it is of the order of $h^5$ (why is
it so?).

Formula (195) is utilized in the following way. First we find
the values

$$y_1 = y(x_0 + h), \quad y_2 = y(x_0 + 2h) \quad \text{and} \quad y_3 = y(x_0 + 3h)$$

by means of some other method, for instance, by Taylor’s formula
(see Sec. 24) or by the Runge-Kutta method (see Sec. 32). Then we
compute the corresponding values

$$y'_0 = f(x_0, y_0), \quad y'_1 = f(x_1, y_1), \quad y'_2 = f(x_2, y_2) \quad \text{and} \quad y'_3 = f(x_3, y_3)$$

This enables us to determine

$$\Delta y'_0 = y'_1 - y'_0, \quad \Delta y'_1, \quad \Delta y'_2, \quad \Delta^2 y'_0 = \Delta y'_1 - \Delta y'_0, \quad \Delta^2 y'_1, \quad \Delta^3 y'_0 = \Delta^2 y'_1 - \Delta^2 y'_0$$

Further, putting k = 3 in formula (195) we calculate $y_4$, and then
using this value we find $y'_4 = f(x_4, y_4)$, $\Delta y'_3 = y'_4 - y'_3$, $\Delta^2 y'_2$ and
$\Delta^3 y'_1$. Then putting k = 4 in formula (195) we calculate $y_5$. After
that we use $y_5$ for finding $y'_5 = f(x_5, y_5)$ etc. The calculations are
performed according to the following scheme:

$$\begin{array}{cccccc}
x & y & y' = f(x, y) & \Delta y' & \Delta^2 y' & \Delta^3 y' \\
\hline
x_{k-3} & y_{k-3} & y'_{k-3} & \Delta y'_{k-3} & \Delta^2 y'_{k-3} & \Delta^3 y'_{k-3} \\
x_{k-2} & y_{k-2} & y'_{k-2} & \Delta y'_{k-2} & \Delta^2 y'_{k-2} & \\
x_{k-1} & y_{k-1} & y'_{k-1} & \Delta y'_{k-1} & & \\
x_k & y_k & y'_k & & & \\
x_{k+1} & y_{k+1} & & & &
\end{array}$$
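
The procedure just described can be sketched in Python as follows. This is
an illustration under stated assumptions, not the author's program: the
function name adams is hypothetical, the extrapolation formula is taken in
the finite-difference form with the coefficients 1/2, 5/12 and 3/8, and for
the test problem y' = y, y(0) = 1 the four starting values are simply taken
from the exact solution (in practice they would be found by Taylor's formula
or by the Runge-Kutta method).

```python
import math

def adams(f, x0, y_start, h, n_steps):
    """Adams extrapolation in finite-difference form.

    y_start holds the starting values y_0, y_1, y_2, y_3 found by another method.
    Returns the list y_0, y_1, ..., y_{3 + n_steps}.
    """
    ys = list(y_start)
    fs = [f(x0 + i * h, ys[i]) for i in range(4)]                # y'_0 .. y'_3
    for k in range(3, 3 + n_steps):
        d1 = fs[k] - fs[k - 1]                                   # first difference of y' ending at y'_k
        d2 = fs[k] - 2 * fs[k - 1] + fs[k - 2]                   # second difference
        d3 = fs[k] - 3 * fs[k - 1] + 3 * fs[k - 2] - fs[k - 3]   # third difference
        y_next = ys[k] + (fs[k] + d1 / 2 + 5 * d2 / 12 + 3 * d3 / 8) * h
        ys.append(y_next)
        fs.append(f(x0 + (k + 1) * h, y_next))
    return ys

h = 0.01
start = [math.exp(i * h) for i in range(4)]     # exact starting values, for the test only
ys = adams(lambda x, y: y, 0.0, start, h, 97)   # ys[100] approximates y(1) = e
err = abs(ys[100] - math.e)
print(ys[100], err)
```

Each step requires only one new evaluation of f, since the earlier values of
y' are reused through the differences. This is the practical advantage of
the method over the schemes of Sec. 32.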


34. Milne’s Method. This method can be obtained by means of 
Newton’s first interpolation formula (V.27). It is one of the most 
effective methods. Here we give only the final result, that is the 
scheme for performing calculations implied by the method (which 
was introduced in 1926). 

The calculations are performed according to the following formulas:

$$\begin{aligned}
\bar y_{k+1} &= y_{k-3} + \frac{4}{3}\,(2y'_{k-2} - y'_{k-1} + 2y'_k)\,h, \\
\bar y'_{k+1} &= f(x_{k+1}, \bar y_{k+1}), \\
y_{k+1} &= y_{k-1} + \frac{h}{3}\,(y'_{k-1} + 4y'_k + \bar y'_{k+1})
\end{aligned} \qquad (k = 3, 4, 5, \ldots) \tag{196}$$

where $y'_i = f(x_i, y_i)$. Here, as in the Adams method, the values
$y_0$, $y_1$, $y_2$, $y_3$ should be found in advance by means of some other
method. After the values have been found we put k = 3 in formula
(196) and calculate $\bar y_4$, $\bar y'_4$, $y_4$ in succession. Then, putting k = 4
we find $\bar y_5$, $\bar y'_5$, $y_5$ and so on. The values $y_4$, $y_5$, $y_6$, ... thus
determined are the approximate values of the solution $y(x)$ for $x = x_4$,
$x_5$, $x_6$, ... where $x_i = x_0 + ih$.

It turns out that the absolute error which occurs when we calculate
$y_{k+1}$ according to this method is approximately equal to

$$\frac{|y_{k+1} - \bar y_{k+1}|}{29}$$

Therefore when performing the calculations we can simultaneously
check whether the error lies within the limits of the degree of accuracy
we have chosen for our calculations. If we see that at a certain
stage of our calculations the error falls outside the prescribed limits
we must decrease (from the corresponding value of x onwards) the
step h taking into account that the resultant error of the method
is of the order of $h^4$.
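
A minimal sketch of the scheme, together with the running error estimate, is
given below. It is an illustration only: the function name milne is
hypothetical, the predicted value is the one computed from the four
preceding points and the corrected value is the one finally accepted, and
the starting values for the test problem y' = y, y(0) = 1 are taken from the
exact solution.

```python
import math

def milne(f, x0, y_start, h, n_steps):
    """Milne's predictor-corrector scheme.

    y_start holds y_0, y_1, y_2, y_3. Returns the list of values y_i and the
    list of error estimates |corrected - predicted| / 29 for every new point.
    """
    ys = list(y_start)
    fs = [f(x0 + i * h, ys[i]) for i in range(4)]
    estimates = []
    for k in range(3, 3 + n_steps):
        x_next = x0 + (k + 1) * h
        # predictor: extrapolation from the points k-3, ..., k
        y_pred = ys[k - 3] + 4 * h / 3 * (2 * fs[k - 2] - fs[k - 1] + 2 * fs[k])
        f_pred = f(x_next, y_pred)
        # corrector: Simpson-type formula over the last two steps
        y_next = ys[k - 1] + h / 3 * (fs[k - 1] + 4 * fs[k] + f_pred)
        estimates.append(abs(y_next - y_pred) / 29)
        ys.append(y_next)
        fs.append(f(x_next, y_next))
    return ys, estimates

h = 0.01
start = [math.exp(i * h) for i in range(4)]     # exact starting values, for the test only
ys, estimates = milne(lambda x, y: y, 0.0, start, h, 97)
err = abs(ys[100] - math.e)
print(ys[100], err, max(estimates))
```

In a practical program the estimate would be compared with the prescribed
accuracy at every step, and the step h would be decreased whenever the
estimate exceeds it.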
Multiple Integrals


§ 1. Definition and Basic Properties of Multiple Integrals 

1. Some Examples Leading to the Notion of a Multiple Integral. 
We now consider a solid (Q) with the density of mass distribution p.
The density p can be variable, that is different at different points
of the solid. Let the function p = p(M) (where M is a point of (Q))
be known and let it be necessary to determine the whole mass m
of the solid. An analogous problem for the case of a linear mass
distribution was solved in Secs. XIV.1, 2. Let the reader read these
sections again before proceeding to study multiple integrals.

The spatial case is treated quite similarly. Let us mentally divide the
region (Q) into n parts (subregions) $(\Delta Q_1), (\Delta Q_2), \ldots, (\Delta Q_n)$,
as in Fig. 303. Let the symbols $(\Delta Q_k)$ (k = 1, ..., n) designate the parts
themselves, and let $\Delta Q_k$ designate their volumes. Now, choose an
arbitrary point $M_k$ (k = 1, ..., n) in each of the subregions $(\Delta Q_k)$.
The points $M_1$, $M_2$, ..., $M_n$ are also shown in Fig. 303 which represents
a spatial picture. If the parts $(\Delta Q_k)$ are sufficiently small we can
regard the density as being constant within each of the parts without
an essential error. Then the mass $m_{(\Delta Q_1)}$ of the first part $(\Delta Q_1)$
can be computed as the product of the density by the volume, i.e. as
$p(M_1)\,\Delta Q_1$. The mass $m_{(\Delta Q_2)}$ of the second part is found
similarly and so on. Thus, we obtain

$$m_{(Q)} \approx p(M_1)\,\Delta Q_1 + p(M_2)\,\Delta Q_2 + \ldots + p(M_n)\,\Delta Q_n = \sum_{k=1}^{n} p(M_k)\,\Delta Q_k$$



Fig. 303 
This is an approximate equality since the densities of the parts
are nevertheless variable. But the smaller the parts, the greater
the accuracy. Hence, passing to the limit, as $\Delta Q_k \to 0$ (k = 1, ..., n),
we obtain the exact equality

$$m_{(Q)} = \lim \sum_{k=1}^{n} p(M_k)\,\Delta Q_k \tag{1}$$

The limit is taken here in a process in which not only the volumes 
but also all the linear sizes of the parts of the partitions tend to 
zero. Besides, it is supposed that the limit does not depend on the 
way of partitioning (Q) into subregions. 
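
The passage to the limit can be watched numerically. In the sketch below,
which is an illustration only, the region (Q) is taken to be the unit cube,
the density is p = x^2 + y^2 + z^2 (so that the exact mass equals 1), the
cube is divided into n^3 equal small cubes and the point M_k is chosen at
the centre of each of them. The function name mass_by_integral_sum and the
particular density are assumptions made for the example.

```python
def mass_by_integral_sum(density, n):
    """Integral sum of density over the unit cube divided into n^3 equal cubes."""
    h = 1.0 / n
    dV = h ** 3                      # volume of every subregion
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # the point M_k is taken at the centre of the small cube
                x, y, z = (i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h
                total += density(x, y, z) * dV
    return total

def density(x, y, z):
    """Illustrative density; its integral over the unit cube equals 1."""
    return x * x + y * y + z * z

for n in (5, 10, 20):
    print(n, mass_by_integral_sum(density, n))
```

As n grows the integral sums approach the exact value of the mass, and a
different choice of the points M_k inside the subregions would change the
sums but not their limit.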

Reasoning in a similar way we can conclude that if an electric
charge is distributed over a solid (Q) with density σ the magnitude
q of the charge is found by means of the formula

$$q = \lim \sum_{k=1}^{n} \sigma(M_k)\,\Delta Q_k \tag{2}$$

where the notation is understood as before. 

A mass or a charge can be distributed not only in a volume but 
also over a surface or a curve. When we say that a mass or a charge 
is distributed over a surface this, of course, means that one of the 
dimensions of the domain of space in which the quantity is distributed
is very small relative to the other two dimensions. The distribution
along a curve is understood similarly. Formulas (1) and (2)
remain true in the latter cases if the density p (or σ) is understood
as a surface (areal) density (i.e. mass or charge per unit area) or as
a linear density (i.e. related to unit length) and $\Delta Q_k$ designates
the area or the length of the part $(\Delta Q_k)$ (k = 1, ..., n), respectively.
In the general case we call $\Delta Q_k$ the measure of the subregion
$(\Delta Q_k)$ understanding it as volume, area or length depending on
whether we consider spatial regions, surfaces or curves.

2. Definition of a Multiple Integral. The similarity between for- 
mulas (1) and (2) indicates the advisability of the general definition 
of a multiple integral given below. For definiteness, let us consider 
integrals over three-dimensional regions. The measure of such 
a region is understood as its volume. 

Suppose we are given a bounded (finite) region (Q) in space. Let
a function u = f(M) be defined over (Q) and let the value f(M)
of the function be finite at each point M of the region. To compose
an integral sum we arbitrarily break up the region (Q) into subregions
$(\Delta Q_1), (\Delta Q_2), \ldots, (\Delta Q_n)$ and take an arbitrary point
$M_k$ (k = 1, ..., n) in each of them. Then taking the values $f(M_1)$,
$f(M_2)$, ..., $f(M_n)$ of the function f assumed at the points $M_1$,
$M_2$, ..., $M_n$ we write down the integral sum

$$\sum_{k=1}^{n} u_k\,\Delta Q_k = \sum_{k=1}^{n} f(M_k)\,\Delta Q_k \tag{3}$$

where $\Delta Q_k$ (k = 1, ..., n) designates, as before, the volume of
the subregion $(\Delta Q_k)$.

The limit of the integral sum taken in a process in which all the
linear sizes of the subregions entering into the partitions of the
region (Q) are unlimitedly decreased is called the integral of the
function f over the region (Q). Denoting the integral by the symbol
$\int_{(Q)} u\,dQ$ we can write

$$\int_{(Q)} u\,dQ = \int_{(Q)} f(M)\,dQ = \lim \sum_{k=1}^{n} f(M_k)\,\Delta Q_k \tag{4}$$

(compare this with the basic definitions given in Secs. XIV.2 and
XIV.22). (Q) is called the region (domain) of integration.
Consequently, formulas (1) and (2) can be rewritten as

$$m = \int_{(Q)} p\,dQ \quad \text{and} \quad q = \int_{(Q)} \sigma\,dQ$$

As in Sec. XIV.2, we can integrate both continuous and discon- 
tinuous functions. The existence of limit (4) can be proved for any 
finite function (under some additional conditions) defined in a finite 
region without referring to the physical meaning of an integral 
(by means of purely mathematical considerations). Besides, it is 
not the boundedness of the region that is essential for such a proof 
but the boundedness of its measure. Fig. 264 represents an example 
of a region which extends to infinity but has a finite measure.

The definition of an integral taken over a surface (which can be 
plane or curvilinear) or along a curve is formulated quite similarly. 
In these definitions we must, of course, take the areas or the lengths 
of the subregions instead of volumes when forming a partition of 
the region. In particular, an integral of this type along a curve is 
nothing but a line integral of the first type taken with respect to 
arc length which was studied in Sec. XIV.22. Integrals over a volume 
and over a surface are referred to as multiple integrals. The former 
are also called triple integrals and the latter are called double in- 
tegrals. These terms will be explained in § 3. 

3. Basic Properties of Multiple Integrals. The basic properties
of a definite integral proved in Secs. XIV.4, 5 are implied by the
definition of an integral as the limit of the integral sum (see
Sec. XIV.2). Therefore we can easily extend these properties to multiple
integrals. We enumerate them here.
1. The integral of a sum equals the sum of the integrals of the
summands (the same is true for the difference):

$$\int_{(Q)} (u_1 \pm u_2)\,dQ = \int_{(Q)} u_1\,dQ \pm \int_{(Q)} u_2\,dQ$$

2. A constant factor can be taken outside the sign of integration:

$$\int_{(Q)} Cu\,dQ = C\int_{(Q)} u\,dQ \qquad (C = \text{const})$$

3. The theorem on a partition of the region of integration: for
any partition of the region (Q) into parts the integral over the whole
region is equal to the sum of the integrals over the parts. For
definiteness, if (Q) is divided into the parts $(Q_1)$ and $(Q_2)$ we have

$$\int_{(Q)} u\,dQ = \int_{(Q_1)} u\,dQ + \int_{(Q_2)} u\,dQ$$

4. The integral of unity is equal to the measure of the region of
integration:

$$\int_{(Q)} dQ = Q$$

5. If the region of integration degenerates, that is its measure
turns into zero, the integral itself becomes equal to zero.

When formulating properties 4 and 5 we speak about the measure
of a region (see Sec. 1) understanding it as volume, area or length
depending on whether we consider triple, double or line integrals.

6. If the variables in question have certain dimensions then

$$\left[\int_{(Q)} u\,dQ\right] = [u]\cdot[Q]$$

7. The case of symmetry. If the domain of integration can be divided 
into two symmetric parts and if the integrand takes equal values 
at the corresponding points belonging to these parts, the integral 
over the whole domain is equal to the doubled integral over each 
of these parts. If the integrand is multiplied by —1 when we pass 
from any point belonging to one of the parts to the symmetric point 
in the other part the integral over the whole region is equal to zero. 

It is sometimes possible to break the domain of integration into 
a greater number of equal parts in order to reduce a given integral 
to an integral over a domain of a simpler form than the original
domain. 
8. It is allowable to integrate inequalities: if $u_1 \le u_2$ then

$$\int_{(Q)} u_1\,dQ \le \int_{(Q)} u_2\,dQ \tag{5}$$

The last inequality turns into an equality if and only if
$u_1 \equiv u_2$ provided both functions $u_1$ and $u_2$ are continuous. But
for the case of discontinuous functions integrals (5) can nevertheless
coincide even when the identity $u_1 = u_2$ is violated at points belonging
to degenerate subregions which have a zero measure because
such a violation does not affect the value of the integral (compare
this with the corresponding property of an integral over a line
segment).

9. An integral satisfies the inequalities

$$u_{\min}\,Q \le \int_{(Q)} u\,dQ \le u_{\max}\,Q \tag{6}$$

10. Inequalities (6) are connected with the notion of the mean
value $\bar u$ of a function u over a region (Q) which is defined by means
of the formula

$$\int_{(Q)} \bar u\,dQ = \int_{(Q)} u\,dQ \qquad (\bar u = \text{const})$$

similar to that given in Sec. XIV.5. Thus, we have

$$\bar u = \frac{1}{Q}\int_{(Q)} u\,dQ \quad \text{and} \quad \int_{(Q)} u\,dQ = \bar u\,Q$$

Inequalities (6) imply that $u_{\min} \le \bar u \le u_{\max}$.

All these properties can be visually illustrated if we regard the
function u as the density of a mass distribution and the integral as
the mass itself.

11. There is an inequality of the form

$$\left|\int_{(Q)} u\,dQ\right| \le \int_{(Q)} |u|\,dQ$$

which is similar to that given at the end of Sec. XIV.5.

4. Methods of Applying Multiple Integrals. There are two basic 
schemes of applying multiple integrals to physical problems (compare 
with Sec. XIV.6). The first one is based on representing the quantity 
in question in an approximate form of integral sum (3) in which 
we then pass to the limit, as it was shown in Sec. 1. The second 
scheme is based on composing the “element” (differential) of the 
quantity. We now briefly discuss the latter scheme (we shall dwell 
in more detail on the scheme in § 2). 
Suppose we are interested in a quantity q which corresponds, for
definiteness, to a spatial domain (Q) [the situation is similar to
that considered in Sec. 1 where we had a mass m or a charge q distributed
over a three-dimensional region (Q)]. Let us form the expression
$dq = \varphi(M)\,dQ$ which approximately describes an infinitesimal
portion of the quantity q corresponding to an infinitesimal volume
dQ placed at an arbitrary point M. This expression possesses the
following properties: (1) it is directly proportional to the volume
and (2) it differs from the true value Δq of the quantity q corresponding
to the portion dQ in an infinitesimal term of higher order of
smallness relative to Δq. Now, summing all the quantities dq over all
the “elements of volume” dQ within the domain (Q) we obtain

$$q = q_{(Q)} = \int_{(Q)} \varphi(M)\,dQ \tag{7}$$

As an example of the first method, let us consider the expression of
the static moment (moment of mass) of a material body with respect
to a plane (P). As is known from mechanics, the static moment of
a finite system of material points relative to a plane (P)
is expressed by the formula

$$S_{(P)} = \sum_k m_k z_k$$

where $m_k$ is the mass of the kth point (k = 1, 2, ..., n) and $z_k$
is the coordinate of $m_k$ reckoned along an axis drawn perpendicularly
to the plane (P) (see Fig. 304).

Fig. 304

If the mass is distributed over a spatial region (Q) we divide the
region into n parts $(\Delta Q_1), (\Delta Q_2), \ldots, (\Delta Q_n)$ and consider an
approximate model in which the mass of each of the parts is concentrated
at one of its points. This yields the approximate expression

$$S_{(P)} \approx \sum_{k=1}^{n} m_k z_k$$

which can be written in more detail as

$$S_{(P)} \approx \sum_{k=1}^{n} p(M_k)\,z(M_k)\,\Delta Q_k$$

Passing to the limit we thus obtain

$$S_{(P)} = \int_{(Q)} pz\,dQ \tag{8}$$

The above calculations correspond to the first method mentioned
at the beginning of this section.






If we wanted to use the second method we should write the expression
of the element of moment of mass

$$dS_{(P)} = pz\,dQ$$

Summing, we should deduce the same formula (8).

Knowing the static moment we can readily find the coordinate $z_c$
of the centre of gravity of the solid in question:

$$z_c = \frac{S_{(P)}}{m} = \frac{\int_{(Q)} pz\,dQ}{\int_{(Q)} p\,dQ}$$

The expression is simplified in the case when the solid is homogeneous,
that is when p = const. Then the centre of gravity is referred
to as the geometric centre of gravity, and we have

$$z_c = \frac{p\int_{(Q)} z\,dQ}{p\int_{(Q)} dQ} = \frac{1}{Q}\int_{(Q)} z\,dQ \tag{9}$$

The other coordinates of the centre of gravity of the body (Q) are 
found similarly. The static moments and the coordinates of the 
centre of gravity of a plane geometric figure with respect to a straight 
line lying in the plane are found in like manner. 
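
The formula for the coordinate of the centre of gravity can be illustrated
by replacing both integrals by integral sums. In the sketch below, which is
an assumption-laden example rather than part of the text, the body is the
unit cube with the density p = 1 + z, for which the exact value is
z_c = (1/2 + 1/3) / (3/2) = 5/9. The function name centre_of_gravity_z is
hypothetical.

```python
def centre_of_gravity_z(density, n):
    """Approximate z_c = (integral of p z dQ) / (integral of p dQ) over the unit cube."""
    h = 1.0 / n
    dV = h ** 3
    moment = 0.0     # integral sum for the static moment relative to the plane z = 0
    mass = 0.0       # integral sum for the mass
    for i in range(n):
        for j in range(n):
            for k in range(n):
                x, y, z = (i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h
                dm = density(x, y, z) * dV     # mass of one subregion
                moment += z * dm
                mass += dm
    return moment / mass

def density(x, y, z):
    """Illustrative density increasing with height."""
    return 1.0 + z

z_c = centre_of_gravity_z(density, 20)
print(z_c, 5.0 / 9.0)
```

Since the density increases with z, the centre of gravity lies above the
geometric centre z = 1/2 of the cube, as one would expect.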

5. Geometric Meaning of an Integral Over a Plane Region. Such 
an integral, unlike other integrals defined in Sec. 2, can be directly 
interpreted geometrically. Its geometric meaning is similar to that 
of an ordinary definite integral considered in Sec. XIV.2. Let us be
given an integral of the form

$$I = \int_{(Q)} u\,dQ \tag{10}$$

where (Q) is a domain lying in a plane (P) (see Fig. 305). Let us draw
the u-axis perpendicularly to the plane and construct a line segment
of length u(M) parallel to the u-axis and passing through
a point M belonging to the domain (Q). For simplicity’s sake we
now consider positive values of u; then the segment is drawn in the
positive direction of the u-axis and the end-point N of the segment
lies above the plane (P). If u < 0 the segment is drawn in the negative
direction of the u-axis. When the point M runs throughout
the domain (Q) the corresponding point N describes a surface (S)
which is the graph of the integrand (such a graph was constructed
in Fig. 190). The surface (S) together with the plane figure (Q) and
the cylindrical surface formed by the line segments parallel to the
u-axis and drawn through each point of the contour bordering the
domain (Q) bound a cylindrical body.

Fig. 305

The geometric meaning of integral (10) lies in the fact that it is 
equal to the volume of the cylindrical body. Indeed, the element of 
volume corresponding to a surface element dQ of the domain (Q) 
(see Fig. 305) containing a point M can be regarded as a right cylinder
with base dQ and height u(M) to within infinitesimals of higher
order. Hence, this volume is approximately equal to dV = u dQ.
Summing up these elements of volume we arrive at the formula

$$V = \int_{(Q)} u\,dQ = I$$

which is what we set out to prove. 

If u assumes negative values as well the volumes of the parts of 
the body lying under the plane ( P ) enter into the result with the 
sign minus (this situation is quite analogous to the one considered 
in Sec. XIV. 2). 

By analogy with formula (XIV. 28), we can pass from the volumes 
of cylindrical bodies to the volume of a body having an arbitrary 
form. If (Q) is the projection of the body on a plane (P) and
h = h(M) is the length of the line segment formed by the intersection
of the body with a straight line which is perpendicular to (P) and
passes through the current point M of the domain (Q), we have
the expression

$$V = \int_{(Q)} h\,dQ$$

for the volume of the body. 
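
As an illustration of the last formula, consider a ball of unit radius (an
example chosen here, not taken from the text). Its projection (Q) on the
plane z = 0 is the unit disk, and a vertical straight line through the point
M(x, y) of the disk intersects the ball along a segment of length
h(M) = 2 sqrt(1 - x^2 - y^2). The hypothetical function ball_volume
replaces the double integral by an integral sum over small squares whose
centres fall inside the disk.

```python
import math

def ball_volume(n):
    """Integral sum for V = integral of h dQ over the unit disk, using n x n squares."""
    step = 2.0 / n                   # the disk lies in the square -1 <= x, y <= 1
    dQ = step * step                 # area of one small square
    total = 0.0
    for i in range(n):
        for j in range(n):
            x = -1.0 + (i + 0.5) * step
            y = -1.0 + (j + 0.5) * step
            r2 = x * x + y * y
            if r2 < 1.0:             # the centre of the square belongs to (Q)
                total += 2.0 * math.sqrt(1.0 - r2) * dQ
    return total

V = ball_volume(200)
print(V, 4.0 * math.pi / 3.0)
```

The sum is close to the known value 4 pi / 3 of the volume of the ball. The
accuracy is limited mainly by the squares cut by the boundary of the disk,
where h(M) tends to zero.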

§ 2. Two Types of Physical Quantities 

6. Basic Example. Mass and Its Density. We now consider a mate- 
rial body whose density can be variable in the general case. Let us 
disregard the molecular structure of the body and consider its mass 
to be continuously distributed in space. According to this model 
such a body has a certain density p at each of its points M, and 
hence p is a function of a point of the form p = p(M) (see Sec.
IX. 9). In contrast to it, the mass cannot be regarded as a function 
of a point because the mass of a separate point is equal to zero. 
The mass is a quantity which is distributed in space which means 
that to each region (Q) (regarded as being mentally taken out of 
the space) there corresponds a certain value $m_{(Q)}$ of the mass. As
before, (Q) is regarded here as a symbol designating the region itself, 
not its volume. 
The connection between the mass and the density is as follows.
Let the value $m_{(Q)}$ of the mass corresponding to each domain (Q)
in space be known. Then the ratio

$$p_{av} = \frac{m_{(Q)}}{Q}$$

is referred to as the mean (average) density in (Q). The symbol Q
entering into the ratio designates, as in Sec. 1, the volume of the
domain (Q). Hence, Q is a quantity having the dimension of volume.
[By the way, we shall sometimes refer to (Q) as a volume when there
is no danger of misunderstanding.]

To obtain the density at a certain point M we must pass to the
limit (compare with Sec. IV.1) by making (Q) contract to the point M:

$$p(M) = \lim_{(Q) \to M} p_{av} = \lim_{(Q) \to M} \frac{m_{(Q)}}{Q} \tag{11}$$

This process is analogous to that of calculating a derivative. When
we write $(Q) \to M$ we mean that (Q) is unlimitedly contracted to
the point M. If $(Q) \to M$ we have, naturally, $Q \to 0$ but the requirement
that $(Q) \to M$ is stronger than the condition $Q \to 0$ (why?).
Thus, the density of mass distribution at a point is the mass of
an infinitesimal volume related to unit volume.

Conversely, if the value ρ(M) of the density is known for each
point M, the mass $m_{(\Omega)}$ corresponding to any part (Ω) of
space can be found on the basis of Secs. 1, 2 as the integral

$$m_{(\Omega)} = \int_{(\Omega)} \rho(M)\,d\Omega$$
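The two passages between mass and density can be tried out numerically. The sketch below is only an illustration under stated assumptions: the density `rho` is a hypothetical function chosen for the example, the regions are cubes, and the mass is estimated by a midpoint integral sum rather than computed exactly. The helper names `mass_of_cube` and `mean_density` are introduced here and do not come from the text.

```python
def rho(x, y, z):
    # Hypothetical density, chosen only for illustration.
    return 1.0 + x + 2.0 * y * y + z


def mass_of_cube(cx, cy, cz, side, n=20):
    """Midpoint-rule estimate of the mass in the cube of the given side
    centred at (cx, cy, cz): an integral sum of rho over n**3 cells."""
    h = side / n
    x0, y0, z0 = cx - side / 2, cy - side / 2, cz - side / 2
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * h
        for j in range(n):
            y = y0 + (j + 0.5) * h
            for k in range(n):
                z = z0 + (k + 0.5) * h
                total += rho(x, y, z)
    return total * h ** 3


def mean_density(cx, cy, cz, side):
    """Mass of the cube divided by its volume."""
    return mass_of_cube(cx, cy, cz, side) / side ** 3


if __name__ == "__main__":
    M = (0.5, 0.5, 0.5)
    for side in (1.0, 0.1, 0.01):
        # The mean density approaches rho(M) as the cube contracts to M.
        print(side, mean_density(*M, side))
    print("rho(M) =", rho(*M))
```

As the cube contracts to M the printed mean densities settle near the value of `rho` at M, which is the content of formula (11); the call with side 1.0 is at the same time an estimate of the mass of the unit cube as an integral of the density.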


If we now take into account the molecular structure of the
substance, i.e. if we return from our idealized model to reality, we
cannot even mentally contract the volume (Ω) to a point unlimitedly.
In this case, instead of formula (11), we must write the expression

$$\rho(M) = \frac{m_{(\Delta\Omega)}}{\Delta\Omega}$$

where (ΔΩ) is a volume containing the point M which is regarded
as being practically infinitesimal (see Sec. III.1). Consequently,
the density of a real body at a point is the mean density in a volume
which is sufficiently small relative to the dimensions of the body
and at the same time sufficiently large relative to the molecular
sizes. Hence, in these considerations we pass from the real discrete
structure of a material body to its continuous model in which the
density is obtained in the process of averaging based on the
calculation of the mean density corresponding to the volumes whose
sizes were indicated above.

Further, when considering a continuous medium in our course,
we shall always take the continuous model, abstracting from the
molecular structure of the substance.

7. Quantities Distributed in Space. There are a number of physical
quantities which are analogous to mass in many respects. They
possess the properties considered in the above example of mass
distribution. Examples of such quantities are an electric charge
in a dielectric, quantity of heat, energy of an electromagnetic field
and the like. There is a feature which is common to all such
quantities, namely, they are all distributed in space. In the general
case we say that a quantity q is distributed in space if to each part
(Ω) mentally isolated from the space there corresponds a certain value
$q_{(\Omega)}$ of the quantity. There is only one general requirement
which we introduce here: the quantity must be additive, i.e. for any
partition of (Ω) into parts the value of the quantity corresponding to
(Ω) must be equal to the sum of the values of the quantity
corresponding to the parts. Hence, if, for definiteness, (Ω) is
divided into two parts (Ω₁) and (Ω₂) we must have
$q_{(\Omega)} = q_{(\Omega_1)} + q_{(\Omega_2)}$.

A quantity distributed in space possesses a certain density at
each point. Thus we can speak about the density of an electric
charge, of field energy and so on. In the general case the density
φ is defined by analogy with formula (11):

$$\varphi(M) = \lim_{(\Omega)\to M} \frac{q_{(\Omega)}}{\Omega} \tag{12}$$

The ratio under the sign of limit in (12) is the mean (average) density
of the quantity q in the volume (Ω). The density φ = φ(M) is then
a function of a point. The density of the quantity q at a point M
equals the value of q (corresponding to an infinitesimal region "placed
at the point M") related to unit volume.

Conversely, if the density φ(M) of a quantity q is known, the
quantity q itself is found by the methods described in Secs. 1, 2:

$$q_{(\Omega)} = \lim \sum_{k=1}^{n} \varphi(M_k)\,\Delta\Omega_k = \int_{(\Omega)} \varphi(M)\,d\Omega \tag{13}$$

where the limit is taken in the process in which the linear sizes
of the parts forming partitions of the region are decreased
unlimitedly.

In the general case q and φ can take on values of any sign.
Let us rewrite formula (12) so as to stress that we deal with
infinitesimal volumes. To do this we substitute (ΔΩ) for (Ω):

$$\frac{q_{(\Delta\Omega)}}{\Delta\Omega} \approx \varphi(M), \quad\text{i.e.}\quad \frac{q_{(\Delta\Omega)}}{\Delta\Omega} = \varphi(M) + \alpha$$

where α is infinitesimal when (ΔΩ) → M. This implies

$$q_{(\Delta\Omega)} = \varphi(M)\,\Delta\Omega + \alpha\,\Delta\Omega$$

Thus, the value of q corresponding to a small volume (ΔΩ) is
divided into two parts the first of which is directly proportional to
the volume ΔΩ while the other is of higher order of smallness. The
former summand is therefore called the differential or the element of
the quantity q (compare with Secs. IV.7, 8):

$$dq = \varphi(M)\,\Delta\Omega \tag{14}$$

This implies the physical meaning of dq: it is the value of q which
would correspond to the volume (ΔΩ) if the density were constant
throughout (ΔΩ) and were equal to the density at the point M. In
reality, Δq, i.e. $q_{(\Delta\Omega)}$, does not equal dq in the
general case and differs from it by an infinitesimal of higher order
of smallness. Hence, Δq and dq are equivalent infinitesimals when
(ΔΩ) → M (see Sec. III.8). If it is possible to neglect such
infinitesimals of higher order we simply say that dq is the value of
q corresponding to an infinitesimal volume (ΔΩ). We also call it an
infinitesimal mass, an infinitesimal charge etc. in such cases.
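The difference between the true increment Δq and the element dq can be observed on a one-dimensional example. The following sketch assumes a hypothetical rod with linear charge density `phi(x) = 3x²`, so that the charge on a segment is known in closed form; the names `phi`, `q` and `element` are introduced here for illustration only, and the "volume" ΔΩ is the length of the segment.

```python
def phi(x):
    # Hypothetical linear charge density along a rod, chosen for illustration.
    return 3.0 * x * x


def q(a, b):
    """Exact charge carried by the segment [a, b] (the integral of phi)."""
    return b ** 3 - a ** 3


def element(x, dx):
    """Element dq = phi(x) * dx, the one-dimensional analogue of formula (14)."""
    return phi(x) * dx


if __name__ == "__main__":
    x = 1.0
    for dx in (1e-1, 1e-2, 1e-3):
        delta_q = q(x, x + dx)
        dq = element(x, dx)
        # (delta_q - dq) / dx plays the role of alpha and shrinks with dx.
        print(dx, delta_q, dq, (delta_q - dq) / dx)
```

The last printed column tends to zero together with the length of the segment, which is what is meant by saying that Δq and dq differ by an infinitesimal of higher order. The function `q` is also additive: the charge on [0, 2] equals the sum of the charges on [0, 1] and [1, 2].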

The volume can be regarded as a special case of a quantity q
distributed in space, that is we can put $q_{(\Omega)} = \Omega$. The
corresponding density is then equal to unity (as "a volume related to
unit volume"). Hence, formula (14) implies

$$d\Omega = \Delta\Omega$$

in this case. Therefore formula (14) can be put down in the form

$$dq = \varphi(M)\,d\Omega \tag{15}$$

which is preferable (this resembles the corresponding formula in
Sec. IV.9).

Thus, summing up, we can write the basic formulas connecting
a quantity $q = q_{(\Omega)}$ distributed in space and the
corresponding function of a point (density) φ = φ(M) in the form

$$\varphi(M) = \left(\frac{dq}{d\Omega}\right)_M \quad\text{and}\quad q_{(\Omega)} = \int_{(\Omega)} \varphi(M)\,d\Omega$$

These formulas enable us to pass from one representation of a
quantity to the other in all cases. The density is found by means
of differentiating the quantity and the quantity itself is found by
integrating its density.

It should be noted that there are some quantities differing in
their nature from such quantities as masses and charges which can
also be considered to be distributed in space or over a surface or
a curve. For instance, the static moment or the moment of inertia
of a material body satisfies the conditions formulated in the
definition given at the beginning of Sec. 7 and thus they can be
regarded as being distributed in space although they depend on the
choice of a plane or of an axis. In practical problems we do not
distinguish between such quantities belonging to different classes
and simply write the expression dq according to the considerations
given in Sec. 4 and then perform the summation (or integration) of
the elements on the basis of the additivity law and thus arrive at
an expression of form (7).

A quantity can be distributed not only over a volume but also
over a surface (plane or curvilinear) or along a curve. All the results
obtained in this section remain valid for these cases if we interpret
(Ω) not as a three-dimensional domain mentally taken out of space
but as a part of a surface (i.e. a region on a surface) or a part of a
curve and understand Ω as the area or length of the part, that is
as its measure (see Sec. 1).


§ 3. Computing Multiple Integrals in Cartesian 
Coordinates 

8. Integral Over Rectangle. We now consider an integral

$$I = \int_{(\Omega)} u\,d\Omega \tag{16}$$

where (Ω) is a rectangle bounded by coordinate lines of a Cartesian
coordinate system arbitrarily chosen in a plane (see Fig. 306). The
rectangle is described by the inequalities a ≤ x ≤ b and c ≤ y ≤ d
where a, b, c, d are some constants. When forming an integral sum S
for integral (16) in the Cartesian coordinate system it is natural to
break up (Ω) into parts by means of straight lines parallel to the
coordinate axes which divide the interval a ≤ x ≤ b into parts
$\Delta x_i$ and the interval c ≤ y ≤ d into parts $\Delta y_k$.
Let us denote by $u_{ik}$ the value of the integrand u = u(x, y) at a
point belonging to the subregion adjoining the intersection of the ith
vertical line with the kth horizontal line (see Fig. 306). (Let the
reader pay attention to the fact that the numeration of $u_{ik}$ does
not coincide with the one used in the theory of matrices in Sec. XI
where $a_{ik}$ designates the element of a matrix lying at the
intersection of the ith row of the matrix with its kth column.)

Fig. 306

We then approximately have

$$I \approx S = \sum_{i,k} u_{ik}\,\Delta x_i\,\Delta y_k \tag{17}$$

where the summation is extended over all the subregions (small
rectangles), i.e. over all the values of i and k (for instance,
i = 1, 2, ..., m and k = 1, 2, ..., n).

A sum of form (17) with two summation indices is a two-dimensional
integral sum. To compute it we must first perform summation with
respect to k for a fixed i, that is to sum up the summands forming a
column (the ith column) of the table

$$\begin{array}{ccccc}
u_{11} & u_{21} & u_{31} & \cdots & u_{m1} \\
u_{12} & u_{22} & u_{32} & \cdots & u_{m2} \\
\vdots & \vdots & \vdots & & \vdots \\
u_{1n} & u_{2n} & u_{3n} & \cdots & u_{mn}
\end{array}$$

for each fixed value of i (i = 1, 2, ..., m) and then perform the
summation with respect to i. This results in

$$S = \sum_{i=1}^{m} \Bigl( \sum_{k=1}^{n} u_{ik}\,\Delta x_i\,\Delta y_k \Bigr) = \sum_{i=1}^{m} \Bigl( \sum_{k=1}^{n} u_{ik}\,\Delta y_k \Bigr) \Delta x_i \tag{18}$$

where we have taken outside the brackets the common factor entering
into the summands of the inner sum. The transition from a
two-dimensional sum to a two-fold iterated sum, that is from (17) to
(18), can also be performed in the reverse order: the first, inner,
sum can be taken with respect to i and the outer, repeated, sum
with respect to k.

If the divisions along the y-axis are sufficiently small the sum
inside the brackets in (18) is close to the corresponding integral:

$$\sum_{k=1}^{n} u_{ik}\,\Delta y_k \approx \Bigl( \int_c^d u\,dy \Bigr)_i \tag{19}$$

where the subscript i indicates that the value of x, which is fixed,
is taken for the ith column of the above table. It follows that

$$S \approx \sum_{i=1}^{m} \Bigl( \int_c^d u\,dy \Bigr)_i \Delta x_i$$

But this is also an integral sum for function (19) which depends on
x. Hence, if the divisions along the x-axis are also sufficiently small
we can write

$$S \approx \int_a^b \Bigl( \int_c^d u(x, y)\,dy \Bigr) dx \tag{20}$$

because the sum is close to integral (20).

In the process of decreasing the subregions of the partitions the
approximate equalities (17) and (20) become more and more accurate
and turn into precise relations in the limit. Consequently,

$$I = \int_{(\Omega)} u\,d\Omega = \int_a^b \Bigl( \int_c^d u(x, y)\,dy \Bigr) dx \tag{21}$$

Thus, to compute an integral taken over a rectangle with sides
parallel to the coordinate axes we can first perform the integration
with respect to y, for a fixed x, as the variable of integration y
varies within the rectangle (the inner integration) and then integrate
the result of the first integration (which depends only on x) with
respect to x within the corresponding limits of its variation (the
outer integration).
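The passage from the two-dimensional sum (17) to the iterated sum (18), in either order, can be sketched in a few lines. This is only an illustration under assumptions: the subdivisions are uniform, the value $u_{ik}$ is taken at the centre of each small rectangle, and the function names are introduced here and are not part of the text.

```python
def double_sum(u, a, b, c, d, m, n):
    """Two-dimensional integral sum (17) over m * n equal rectangles."""
    dx, dy = (b - a) / m, (d - c) / n
    total = 0.0
    for i in range(m):
        x = a + (i + 0.5) * dx
        for k in range(n):
            y = c + (k + 0.5) * dy
            total += u(x, y) * dx * dy
    return total


def iterated_y_then_x(u, a, b, c, d, m, n):
    """Iterated sum (18): inner sum over k (along y), outer sum over i."""
    dx, dy = (b - a) / m, (d - c) / n
    outer = 0.0
    for i in range(m):
        x = a + (i + 0.5) * dx
        inner = sum(u(x, c + (k + 0.5) * dy) * dy for k in range(n))
        outer += inner * dx
    return outer


def iterated_x_then_y(u, a, b, c, d, m, n):
    """The reverse order: inner sum over i (along x), outer sum over k."""
    dx, dy = (b - a) / m, (d - c) / n
    outer = 0.0
    for k in range(n):
        y = c + (k + 0.5) * dy
        inner = sum(u(a + (i + 0.5) * dx, y) * dx for i in range(m))
        outer += inner * dy
    return outer


if __name__ == "__main__":
    u = lambda x, y: x * y * y  # sample integrand; the exact integral is 18
    args = (u, 0.0, 2.0, 0.0, 3.0, 200, 200)
    print(double_sum(*args), iterated_y_then_x(*args), iterated_x_then_y(*args))
```

For the sample integrand u = xy² over 0 ≤ x ≤ 2, 0 ≤ y ≤ 3 all three sums agree up to rounding and are close to the exact value 18 obtained from the iterated integral.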

The reverse order of passing from sum (17) to an iterated two-fold
sum (see above) would yield

$$\int_{(\Omega)} u\,d\Omega = \int_c^d \Bigl( \int_a^b u(x, y)\,dx \Bigr) dy \tag{22}$$

Hence, when computing a double integral in Cartesian coordinates
we have two ways of passing to a repeated (iterated) two-fold
integral (as will be shown in § 4, an analogous passage to an
iterated integral can be performed in any coordinate system). It
should be noted that one of the ways usually turns out to be more
difficult for practical calculations whereas the other is simpler.
The transition from one of these ways to the other is referred to as
the inversion of the order of integration.

Formula (21) can be readily interpreted geometrically. According to
Sec. 5, integral (16) is equal to the volume of the solid depicted in
Fig. 307. On the basis of Sec. XIV.10 we can compute the volume by
integrating the cross section area shaded in Fig. 307. Hence, we
obtain

$$I = V = \int_a^b S(x)\,dx = \int_a^b \Bigl( \int_c^d u\,dy \Bigr) dx$$

The geometric meaning of formula (22) is analogous to that of (21).
Hence, formulas (21) and (22) can be proved in a simpler manner
but we have given a more complicated method of proving since
it can be automatically extended to multiple integrals of an
arbitrary order.

Bearing in mind formulas (21) and (22) we sometimes denote the
original integral (16) as

$$I = \iint_{(\Omega)} u\,d\Omega \quad\text{or}\quad I = \iint_{(\Omega)} u\,dx\,dy$$

meaning that dΩ = dx dy for a partition of the type shown in
Fig. 306 when the divisions along the coordinate axes are sufficiently
small.

The computation of a repeated integral with constant limits of
integration of form (21) becomes particularly simple when the
integrand is a product of two factors each of which depends only on
one variable of integration. Namely, if $u(x, y) = f_1(x)\,f_2(y)$ we
have

$$I = \int_a^b \Bigl( \int_c^d f_1(x) f_2(y)\,dy \Bigr) dx = \int_a^b f_1(x) \Bigl( \int_c^d f_2(y)\,dy \Bigr) dx = \int_a^b f_1(x)\,dx \cdot \int_c^d f_2(y)\,dy$$

Thus we have obtained the product of two one-dimensional integrals.

9. Integral Over an Arbitrary Plane Region. Let (Ω), entering
into integral (16), be an arbitrary plane figure lying in the
x, y-plane. For instance, take the domain depicted in Fig. 308. The
considerations given in Sec. 8 can be transferred to this case with
some slight changes. Namely, instead of integral (19) we arrive at an
integral of the form

$$\int_{y_1}^{y_2} u(x, y)\,dy = \int_{\varphi_1(x)}^{\varphi_2(x)} u(x, y)\,dy$$

where $y = y_1 = \varphi_1(x)$ and $y = y_2 = \varphi_2(x)$ are,
respectively, the equations of the lower and upper parts of the
boundary of the domain (Ω). The contour bordering the figure (Ω) is
divided into these two parts by the points A and B (see Fig. 308).
Accordingly, the final result [which substitutes for formula (21) in
this case] will be of the form

$$I = \int_{(\Omega)} u\,d\Omega = \int_a^b \Bigl( \int_{\varphi_1(x)}^{\varphi_2(x)} u(x, y)\,dy \Bigr) dx \tag{23}$$

Consequently, the limits of integration in the inner integral are
variable in the general case; they depend on the variable of
integration in the outer integral (i.e. on x in our case). The
character of this dependence is specified by the form of the contour.
But the limits of integration in the outer integral are constant as
before. They are specified by the maximal range of variation of x.
Hence, we see that the rule given after formula (21) remains valid
for a domain (Ω) of general form.

Here we can also invert the order of integration, that is perform
the first integration with respect to x and the second integration
with respect to y. Then in place of formula (22) we arrive at a
formula of the form

$$\int_{(\Omega)} u\,d\Omega = \int_c^d \Bigl( \int_{\psi_1(y)}^{\psi_2(y)} u(x, y)\,dx \Bigr) dy \tag{24}$$

[let the reader find out what c, d, $\psi_1(y)$, $\psi_2(y)$ are by
examining Fig. 308].

It is sometimes necessary to break the domain of integration
into several parts before setting up the limits of integration.

For example, let it be necessary to invert the order of integration
in the integral

$$I = \int_0^1 dx \int_{x^2}^{2x} f(x, y)\,dy$$

which can be written at length as

$$I = \int_0^1 \Bigl( \int_{x^2}^{2x} f(x, y)\,dy \Bigr) dx \tag{25}$$

To do this we must first determine the geometric form of the domain
of integration. In this case it is bounded by the lines x = 0, x = 1,
y = x² and y = 2x (see Fig. 309), and the first, inner, integration
is performed along the line segments parallel to the y-axis, the
segments being shown in continuous lines in Fig. 309.

After the inversion of the order of integration the inner
integration will be carried out along the line segments parallel to
the x-axis shown in dotted lines in Fig. 309. We see that after the
order of integration has been inverted the inner integration is
performed from the straight line x = y/2 to the parabola x = √y for
y < 1 and from the straight line x = y/2 to the straight line x = 1
for y > 1. The value of the variable y which corresponds to the
division of the domain of integration into the two parts (in which the
upper limits of the inner integrals differ) is found as the ordinate
of the point of intersection of the parabola y = x² with the straight
line x = 1, i.e. y = 1. Hence, after the inversion of the order of
integration we obtain the sum of two integrals of the form

$$I = \int_0^1 dy \int_{y/2}^{\sqrt{y}} f(x, y)\,dx + \int_1^2 dy \int_{y/2}^{1} f(x, y)\,dx$$

instead of formula (25).
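The inversion just carried out can be checked numerically for a particular integrand. The sketch below assumes the sample function f(x, y) = xy (chosen here only for illustration) and a midpoint rule with uniform subdivisions; the helper `iterated` and its parameter names are hypothetical.

```python
import math


def iterated(g, a, b, lo, hi, n=400):
    """Midpoint-rule value of the iterated integral
    int_a^b ( int_{lo(s)}^{hi(s)} g(s, t) dt ) ds,
    where s is the outer variable and t the inner one."""
    ds = (b - a) / n
    outer = 0.0
    for i in range(n):
        s = a + (i + 0.5) * ds
        t0, t1 = lo(s), hi(s)
        dt = (t1 - t0) / n
        inner = sum(g(s, t0 + (j + 0.5) * dt) for j in range(n)) * dt
        outer += inner * ds
    return outer


def f(x, y):
    # Sample integrand, an assumption made for this illustration.
    return x * y


# Order of formula (25): outer variable x, inner variable y from x^2 to 2x.
original = iterated(lambda x, y: f(x, y), 0.0, 1.0,
                    lambda x: x * x, lambda x: 2.0 * x)

# Inverted order: outer variable y, inner variable x, domain split at y = 1.
inverted = (
    iterated(lambda y, x: f(x, y), 0.0, 1.0,
             lambda y: y / 2.0, lambda y: math.sqrt(y))
    + iterated(lambda y, x: f(x, y), 1.0, 2.0,
               lambda y: y / 2.0, lambda y: 1.0)
)

if __name__ == "__main__":
    print(original, inverted)
```

For f = xy a direct calculation of (25) gives 5/12, and both printed values lie close to it, which confirms that the two parts of the inverted integral together cover the same domain.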

In more complicated cases it is sometimes necessary to divide 
the domain of integration into a greater number of parts. For example, 
to set up the limits of integration in Cartesian coordinates for an 
integral taken over the domain shown in Fig. 310 it is necessary 
to break the domain into five parts (what are these parts?). 

We now consider a simple example of an application of the double
integral. By analogy with formula (9), we can easily deduce the
formulas for the coordinates of the geometric centre of gravity of
a plane figure (σ):

$$x_c = \frac{\iint_{(\sigma)} x\,dx\,dy}{\sigma}, \qquad y_c = \frac{\iint_{(\sigma)} y\,dx\,dy}{\sigma} \tag{26}$$

where σ designates the area of the figure (σ). Let the whole figure
(σ) entirely lie on one side of the x-axis (see Fig. 311). Then the
second formula (26) can be rewritten in the form

$$\sigma y_c = \int_a^b dx \int_{y_1}^{y_2} y\,dy = \int_a^b \Bigl( \frac{y_2^2}{2} - \frac{y_1^2}{2} \Bigr) dx = \frac{1}{2}\int_a^b y_2^2\,dx - \frac{1}{2}\int_a^b y_1^2\,dx$$

Multiplying both sides by 2π and recalling formula (XIV.35)
of the volume of a solid of revolution we arrive at Guldin's second
theorem: if a homogeneous plane figure rotates about an axis lying
in the plane of the figure and not intersecting it the volume of the
solid of revolution thus obtained is equal to the product of the area
of the figure by the distance covered by its centre of gravity. On the
basis of the theorem we readily find the geometric centre of gravity
of a semicircle of radius R (see Fig. 312):

$$\frac{4}{3}\pi R^3 = \frac{\pi R^2}{2} \cdot 2\pi y_c$$

that is

$$y_c = \frac{4}{3\pi} R \approx 0.424R$$
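The result for the semicircle can be compared with a direct numerical evaluation of the second formula (26). The sketch below is an illustration under assumptions: the figure is described as y₁(x) ≤ y ≤ y₂(x) for a ≤ x ≤ b, the inner integration over y is done analytically as in the text, and the outer one by a midpoint rule. The name `centroid_height` is introduced here and is not taken from the text.

```python
import math


def centroid_height(y1, y2, a, b, n=2000):
    """Return (area, y_c) for the figure y1(x) <= y <= y2(x), a <= x <= b.
    Uses sigma * y_c = (1/2) * integral of (y2^2 - y1^2) dx."""
    dx = (b - a) / n
    area = 0.0
    moment = 0.0  # approximates sigma * y_c
    for i in range(n):
        x = a + (i + 0.5) * dx
        lo, hi = y1(x), y2(x)
        area += (hi - lo) * dx
        moment += 0.5 * (hi * hi - lo * lo) * dx
    return area, moment / area


R = 1.0
area, y_c = centroid_height(
    lambda x: 0.0,
    lambda x: math.sqrt(max(R * R - x * x, 0.0)),
    -R, R,
)
# Guldin's second theorem: volume = area * path of the centre of gravity.
volume = area * 2.0 * math.pi * y_c

if __name__ == "__main__":
    print(y_c, 4.0 * R / (3.0 * math.pi))
    print(volume, 4.0 / 3.0 * math.pi * R ** 3)
```

For R = 1 the computed ordinate is close to 4/(3π), and the product of the area by the path 2πy_c is close to the volume of the unit sphere, as Guldin's theorem requires.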

10. Integral Over an Arbitrary Surface. Let us consider the
integral

$$I = \int_{(\Omega)} u\,d\Omega \tag{27}$$

taken over an arbitrary surface (Ω) which can be curvilinear in the
general case (see Fig. 313). To compute it in Cartesian coordinates
we must consider the projection of the surface (Ω) on one of the
coordinate planes. For definiteness, let us take the projection of
(Ω) on the x, y-plane which we denote by (Ω').

Since the element (the area of an infinitesimal part) of a
curvilinear surface can be regarded as being plane to within
infinitesimals of higher order of smallness relative to the area, we
have

$$d\Omega' = d\Omega\,|\cos\alpha| = d\Omega\,|\cos(\mathbf{n}, z)|$$

where n is a normal vector to the surface. It follows that

$$I = \int_{(\Omega)} u\,d\Omega = \int_{(\Omega')} u\,\frac{d\Omega'}{|\cos(\mathbf{n}, z)|} \tag{28}$$

The last integral taken over the plane figure (Ω') is computed by
means of the methods of Sec. 9.

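Let the surface in question be represented by an equation of the
form z = f(x, y). Then, according to Sec. XII.2, the vector

$$\mathbf{n} = -\frac{\partial f}{\partial x}\,\mathbf{i} - \frac{\partial f}{\partial y}\,\mathbf{j} + \mathbf{k}$$

is directed along the normal to the surface at every point x, y, z
belonging to the surface. Hence (see Sec. VII.10) we have

$$\cos(\mathbf{n}, z) = \frac{\mathbf{n}\cdot\mathbf{k}}{|\mathbf{n}|\,|\mathbf{k}|} = \frac{1}{\sqrt{\left(\dfrac{\partial f}{\partial x}\right)^2 + \left(\dfrac{\partial f}{\partial y}\right)^2 + 1}}$$

Therefore, if integral (27) is given in the form

$$I = \int_{(\Omega)} u(x, y, z)\,d\Omega$$

we obtain, on the basis of formula (28), the expression

$$I = \iint_{(\Omega')} u\bigl(x, y, f(x, y)\bigr) \sqrt{1 + (f'_x)^2 + (f'_y)^2}\;dx\,dy$$

In particular, taking into account property 4 in Sec. 3 we derive
the formula for the area Ω of an arbitrary surface (Ω):

$$\Omega = \int_{(\Omega)} d\Omega = \iint_{(\Omega')} \sqrt{1 + (f'_x)^2 + (f'_y)^2}\;dx\,dy$$

Here, as above, (Ω') is the projection of the surface (Ω) on the
x, y-plane and z = f(x, y) is the equation of the surface.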
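The area formula lends itself to a direct numerical check when the projection (Ω') is a rectangle. The following sketch rests on several assumptions that are not in the text: the partial derivatives are approximated by central differences with a small step `h`, the double integral is replaced by a midpoint sum, and the function name `surface_area` is hypothetical.

```python
import math


def surface_area(f, a, b, c, d, n=200, h=1e-5):
    """Approximate area of the surface z = f(x, y) lying above the
    rectangle a <= x <= b, c <= y <= d."""
    dx, dy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        for k in range(n):
            y = c + (k + 0.5) * dy
            # Central-difference estimates of the partial derivatives of f.
            fx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
            fy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
            total += math.sqrt(1.0 + fx * fx + fy * fy) * dx * dy
    return total


if __name__ == "__main__":
    # A plane z = 2x + 2y: the factor under the root is constant and equals 3.
    print(surface_area(lambda x, y: 2.0 * x + 2.0 * y, 0.0, 1.0, 0.0, 1.0))
    # A parabolic cylinder z = x^2 / 2 over the unit square.
    print(surface_area(lambda x, y: 0.5 * x * x, 0.0, 1.0, 0.0, 1.0))
```

For the plane the result is the area of the projection divided by the cosine of the angle between the normal and the z-axis, i.e. 3; for the parabolic cylinder it agrees with the arc length of the parabola over 0 ≤ x ≤ 1, namely (√2 + ln(1 + √2))/2.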

When projecting a surface, we sometimes have to divide it into
several parts. The projection on the planes y, z or x, z and the
corresponding computation of a surface integral are performed in a
similar way when it is expedient. [Let the reader deduce the formula
of |cos(n, z)| for a surface represented by an equation of the form
F(x, y, z) = 0.]

11. Integral Over a Three-Dimensional Region. Let us now consider
an integral

$$I = \int_{(\Omega)} u\,d\Omega$$

where (Ω) is a solid, that is a domain in space. We compute it
following the procedure which was developed in Secs. 8 and 9 for an
integral over a plane figure. The corresponding integral sum is now
represented as a three-fold iterated sum. In the simplest case when
(Ω) is a rectangular parallelepiped defined by the inequalities
a ≤ x ≤ b, c ≤ y ≤ d and e ≤ z ≤ f we obtain, after passing to
the limit in the integral sum, the formula

$$I = \int_a^b dx \int_c^d dy \int_e^f u(x, y, z)\,dz$$

that is

$$I = \int_a^b \Bigl( \int_c^d \Bigl( \int_e^f u(x, y, z)\,dz \Bigr) dy \Bigr) dx$$

By the way, it is possible to perform here the integration by
inverting the order of integration in five different ways because
there are six different combinations (permutations) of the
differentials dx, dy, dz.

In the case of a domain of integration of a more general form the
determination of the limits of integration will be more complicated.
Suppose that we want to set up the limits of integration when
integrating in the following order:

$$I = \int_{(\Omega)} u\,d\Omega = \int dx \int dy \int u(x, y, z)\,dz \tag{29}$$

Let the domain of integration be of the form shown in Fig. 314.
Here the first (inner) integration is performed with respect to z
within the domain (Ω) for fixed x and y. Therefore the limits of
this integration are $z_1$ and $z_2$ (see Fig. 314), i.e.
$\varphi_1(x, y)$ and $\varphi_2(x, y)$ where $z = \varphi_1(x, y)$
and $z = \varphi_2(x, y)$ are the equations of the lower and upper
parts of the surface bordering the solid (Ω).

After the integration with respect to z and the substitution of the
limits of integration have been performed the result of the first
integration depends only on x and y. Now we pass to the projection
(Ω') of the solid (Ω) on the x, y-plane and perform the integration
with respect to y (the second integration). When integrating with
respect to y within the projection (Ω') we keep x fixed. Thus the
limits of the second integration are $y_1 = \psi_1(x)$ and
$y_2 = \psi_2(x)$, as it was described in Sec. 9. The result of the
second integration will depend only on x. It should be integrated with
respect to x over the maximal range of x, that is from a to b. This
is the third, outer integration. Thus, after setting up the limits of
integration we can put down integral (29) in the form

$$\int_{(\Omega)} u\,d\Omega = \int_a^b dx \int_{\psi_1(x)}^{\psi_2(x)} dy \int_{\varphi_1(x, y)}^{\varphi_2(x, y)} u(x, y, z)\,dz$$

Fig. 314

The reader should pay attention to the fact that the limits of
integration in each integral depend only on those variables with
respect to which the integration has not yet been performed. In
particular, the limits of the outer integration cannot depend on
the variables of integration and are constant.
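The way the limits nest can be seen in a short sketch. It assumes a particular solid, the tetrahedron x ≥ 0, y ≥ 0, z ≥ 0, x + y + z ≤ 1, which is not taken from the text, and it uses a midpoint rule with uniform subdivisions; the name `triple_iterated` and its parameters are hypothetical.

```python
def triple_iterated(u, a, b, y_lo, y_hi, z_lo, z_hi, n=60):
    """Midpoint-rule value of
    int_a^b dx int_{y_lo(x)}^{y_hi(x)} dy int_{z_lo(x,y)}^{z_hi(x,y)} u dz.
    The y-limits depend only on x, the z-limits on x and y."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        y0, y1 = y_lo(x), y_hi(x)
        dy = (y1 - y0) / n
        middle = 0.0
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            z0, z1 = z_lo(x, y), z_hi(x, y)
            dz = (z1 - z0) / n
            inner = sum(u(x, y, z0 + (k + 0.5) * dz) for k in range(n)) * dz
            middle += inner * dy
        total += middle * dx
    return total


def over_tetrahedron(u):
    """Integral of u over the tetrahedron x, y, z >= 0, x + y + z <= 1."""
    return triple_iterated(
        u, 0.0, 1.0,
        lambda x: 0.0, lambda x: 1.0 - x,
        lambda x, y: 0.0, lambda x, y: 1.0 - x - y,
    )


if __name__ == "__main__":
    print(over_tetrahedron(lambda x, y, z: 1.0))  # volume, exactly 1/6
    print(over_tetrahedron(lambda x, y, z: z))    # exactly 1/24
```

The outer limits 0 and 1 are constants, the middle limits are functions of x alone, and the inner limits are functions of x and y, in agreement with the rule stated above.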

We can similarly set up the limits of integration in the other five
possible cases of inverting the order of integration. As in Sec. 9,
we sometimes have to divide the domain of integration into several
parts when setting up the limits of integration if the domain (Ω)
is of a more complicated form.


§ 4. Change of Variables in Multiple Integrals 

12. Passing to Polar Coordinates in Plane. As in the case of a
one-dimensional integral, we can introduce different variables of
integration when computing a double integral. Here we shall consider
a typical example of computing a double integral in polar
coordinates. Let us take an integral of the form

$$I = \int_{(\Omega)} u\,d\Omega$$

where (Ω) is a region in the x, y-plane which is depicted in Fig. 315.
If it is necessary to perform the integration in polar coordinates
we must divide the domain into parts by means of the coordinate
curves of the polar coordinate system, i.e. by the lines ρ = const
and φ = const (see Sec. II.5), as it is shown in Fig. 315. Each of the
elementary areas thus obtained can be regarded as being equal to
a rectangle with sides dρ and ρ dφ to within infinitesimals of higher
order (why is it so?). Hence, we have

$$d\Omega = \rho\,d\rho\,d\varphi$$

Performing the summation over all the elementary areas we obtain

$$I = \iint_{(\Omega)} u\rho\,d\rho\,d\varphi$$

where the integrand must be, of course, expressed as a function of
ρ and φ. By analogy with Sec. 9, we set up the limits of integration
and thus obtain

$$\int_{(\Omega)} u\,d\Omega = \int_{\alpha}^{\beta} d\varphi \int_{\rho_1(\varphi)}^{\rho_2(\varphi)} u\rho\,d\rho \tag{30}$$

The geometric meaning of the limits of integration is illustrated in
Fig. 315.
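Formula (30) can be illustrated by a short computation. The sketch assumes uniform subdivisions in φ and in ρ with the integrand evaluated at cell centres; the region bounded by the cardioid ρ = 1 + cos φ is an example chosen here and does not appear in the text, and the name `polar_integral` is hypothetical.

```python
import math


def polar_integral(u, alpha, beta, rho1, rho2, n=400):
    """Midpoint-rule value of
    int_alpha^beta dphi int_{rho1(phi)}^{rho2(phi)} u(rho, phi) * rho drho."""
    dphi = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        phi = alpha + (i + 0.5) * dphi
        r0, r1 = rho1(phi), rho2(phi)
        drho = (r1 - r0) / n
        inner = 0.0
        for j in range(n):
            rho = r0 + (j + 0.5) * drho
            # The extra factor rho comes from the element of area rho drho dphi.
            inner += u(rho, phi) * rho * drho
        total += inner * dphi
    return total


if __name__ == "__main__":
    # Area enclosed by the cardioid rho = 1 + cos(phi); the exact value is 3*pi/2.
    print(polar_integral(lambda rho, phi: 1.0, 0.0, 2.0 * math.pi,
                         lambda phi: 0.0, lambda phi: 1.0 + math.cos(phi)))
    # Integral of x^2 + y^2 = rho^2 over the unit disc; the exact value is pi/2.
    print(polar_integral(lambda rho, phi: rho * rho, 0.0, 2.0 * math.pi,
                         lambda phi: 0.0, lambda phi: 1.0))
```

In the first call the inner limit depends on φ, as in the general formula (30); in the second both pairs of limits are constant because the boundary of the disc is a coordinate curve ρ = const.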

Polar coordinates are particularly convenient for regions whose
boundary consists of coordinate curves of the polar coordinate system
because in such cases, when setting up the limits of integration, we
obtain constant limits not only in the outer integral but also in the
inner one. For example, after the limits of integration are set up, an
integral taken over the domain shown in Fig. 310 will have the
form

$$\int_{\varphi_1}^{\varphi_2} d\varphi \int_{r_1}^{r_2} u\rho\,d\rho$$


13. Passing to Cylindrical and Spherical Coordinates. Let us
take an integral

$$I = \int_{(\Omega)} u\,d\Omega \tag{31}$$

where (Ω) is a domain of space. If it is necessary to perform the
integration in cylindrical coordinates (see Sec. X.1) we have to
divide the domain into parts by means of the coordinate surfaces of
the cylindrical coordinate system, i.e. the surfaces ρ = const,
φ = const and z = const. Then each of the elements of volume (see
Fig. 316) can be regarded as being equal to the volume of the
rectangular parallelepiped with dimensions dρ, ρ dφ and dz to within
infinitesimals of higher order of smallness (relative to the element
of volume). We suggest that the reader should verify this assertion.
Consequently, we have

$$d\Omega = d\rho \cdot \rho\,d\varphi \cdot dz = \rho\,d\rho\,d\varphi\,dz$$

Therefore integral (31) takes the form

$$I = \iiint_{(\Omega)} u\rho\,d\rho\,d\varphi\,dz$$

where the limits of integration are still to be set up as in Sec. 11
where we set the limits in Cartesian coordinates.

If we use spherical coordinates (see Sec. X.1) the element of
volume can be again regarded as being approximately equal to the
volume of the corresponding rectangular parallelepiped (see Fig. 317).
In this case the rectangular parallelepiped has the sides dr, r dθ
and r sin θ dφ and thus we have

$$d\Omega = dr \cdot r\,d\theta \cdot r\sin\theta\,d\varphi = r^2 \sin\theta\,dr\,d\theta\,d\varphi$$

Consequently, integral (31) takes the form

$$I = \iiint_{(\Omega)} u r^2 \sin\theta\,dr\,d\theta\,d\varphi \tag{32}$$

The limits of integration are set up in a particularly simple manner
in these coordinate systems (and also in other systems) when the
boundary of the region (Ω) consists of coordinate surfaces because
in such a case not only the limits of the outer integration are
constant but those of the first and second integrations as well.

As an example, let us consider the problem of determining the
position of the geometrical centre of gravity of a solid having the
form of a hemisphere of radius R. To do this we place the hemisphere
as it is shown in Fig. 318. Then the symmetry implies that the centre
of gravity will lie on the z-axis. Taking advantage of formula (9),
passing to spherical coordinates by means of formula (32) and taking
into account that z = r cos θ we obtain the following expression:

$$z_c = \frac{1}{\frac{2}{3}\pi R^3} \iiint_{(\Omega)} r\cos\theta \cdot r^2 \sin\theta\,dr\,d\theta\,d\varphi = \frac{3}{2\pi R^3} \int_0^{2\pi} d\varphi \int_0^{\pi/2} d\theta \int_0^R r^3 \sin\theta\cos\theta\,dr = \frac{3}{8}R$$

(Check up the calculations!)
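The suggested check can also be made numerically. The sketch below assumes a midpoint rule on a uniform grid in r, θ and φ and computes both the static moment and the volume from formula (32); the function name `hemisphere_centroid` is introduced here only for illustration.

```python
import math


def hemisphere_centroid(R, n=40):
    """Approximate z-coordinate of the centre of gravity of the hemisphere
    0 <= r <= R, 0 <= theta <= pi/2, 0 <= phi <= 2*pi."""
    dr = R / n
    dtheta = (math.pi / 2.0) / n
    dphi = (2.0 * math.pi) / n
    moment = 0.0  # integral of z over the solid
    volume = 0.0  # integral of 1 over the solid
    for _ in range(n):  # the integrand does not depend on phi
        for j in range(n):
            theta = (j + 0.5) * dtheta
            for k in range(n):
                r = (k + 0.5) * dr
                element = r * r * math.sin(theta) * dr * dtheta * dphi
                volume += element
                moment += r * math.cos(theta) * element
    return moment / volume


if __name__ == "__main__":
    for R in (1.0, 2.0):
        print(R, hemisphere_centroid(R), 3.0 * R / 8.0)
```

For each radius the computed value is close to 3R/8. All three pairs of limits are constant here because the hemisphere is bounded by the coordinate surfaces r = R and θ = π/2.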

14. Curvilinear Coordinates in Plane. Besides Cartesian and polar
coordinates, we can introduce many other coordinate systems in
plane. Their common feature is that the points of the plane are always
characterized by two coordinates (see Sec. X.2).

We now consider a general coordinate system λ, μ whose coordinate
curves λ = const and μ = const are depicted in Fig. 319.
If these curves are drawn sufficiently close to one another the plane
is divided into small figures (cells) which can be regarded as
parallelograms to within infinitesimals of higher order.

Let the curves λ = const be drawn with the interval dλ and the
curves μ = const with the interval dμ. We denote the sides of one
of the small parallelograms by $ds_\lambda = MP$ and $ds_\mu = MN$
(see Fig. 319). If we neglect the infinitesimals of higher order we
can consider these sides to be directly proportional to dλ and dμ,
i.e.

$$ds_\lambda = l_\lambda\,d\lambda, \qquad ds_\mu = l_\mu\,d\mu \tag{33}$$

The quantities $l_\lambda$ and $l_\mu$ are called Lamé's coefficients
after G. Lamé (1795-1870), a French mathematician and engineer. The
coefficients make it possible to compute linear sizes in a curvilinear
coordinate system. For a given coordinate system, Lamé's coefficients
can have different values at different points of the plane in the
general case. For instance, Fig. 319 indicates that these values are
smaller in the lower part of the plane than in the upper (why?). If it
is necessary to calculate the arc length of a finite portion of a
coordinate curve we must integrate the corresponding relation (33).

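Let us introduce the radius-vector
$\mathbf{r} = \mathbf{r}(\lambda, \mu) = \overrightarrow{OM}$ of a
variable point M of the plane drawn from a fixed point O. Then the
sides $\overrightarrow{MP}$ and $\overrightarrow{MN}$ of the
elementary parallelogram depicted in Fig. 319 are equal to

$$d_\lambda \mathbf{r} = \mathbf{r}'_\lambda\,d\lambda \quad\text{and}\quad d_\mu \mathbf{r} = \mathbf{r}'_\mu\,d\mu$$

to within infinitesimals of higher order since these increments of
the radius-vector are due to the variation of only one of the
coordinates. It follows that
$|d_\lambda \mathbf{r}| = |\mathbf{r}'_\lambda|\,d\lambda$. But,
according to Sec. VII.23, we have $|d\mathbf{r}| = ds$, and therefore
$|d_\lambda \mathbf{r}| = ds_\lambda$. Hence, taking advantage of
(33), we derive

$$l_\lambda = |\mathbf{r}'_\lambda|, \quad\text{and similarly}\quad l_\mu = |\mathbf{r}'_\mu|$$

If besides the curvilinear coordinates λ, μ we introduce Cartesian
coordinates x, y whose origin is placed at the same point O we shall
have r = xi + yj (see Sec. VII.9), and consequently

$$l_\lambda = \left| \frac{\partial x}{\partial\lambda}\,\mathbf{i} + \frac{\partial y}{\partial\lambda}\,\mathbf{j} \right| = \sqrt{\left(\frac{\partial x}{\partial\lambda}\right)^2 + \left(\frac{\partial y}{\partial\lambda}\right)^2}$$

and

$$l_\mu = \left| \frac{\partial x}{\partial\mu}\,\mathbf{i} + \frac{\partial y}{\partial\mu}\,\mathbf{j} \right| = \sqrt{\left(\frac{\partial x}{\partial\mu}\right)^2 + \left(\frac{\partial y}{\partial\mu}\right)^2}$$

Let the reader derive the formulas $l_\rho = 1$ and $l_\varphi = \rho$
for a polar coordinate system whose origin coincides with the origin
of the Cartesian system x, y first on the basis of formulas
x = ρ cos φ, y = ρ sin φ and then directly by taking advantage of
formulas (33)!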

The area dσ of any elementary parallelogram shown in Fig. 319
is proportional both to dλ and dμ, i.e.

$$d\sigma = k\,d\lambda\,d\mu \tag{34}$$

where k is a coefficient which can take on different values at
different points in the general case. Applying the formula of the area
of a parallelogram we obtain

$$k = \frac{d\sigma}{d\lambda\,d\mu} = \frac{ds_\lambda\,ds_\mu \sin\alpha}{d\lambda\,d\mu} = l_\lambda l_\mu \sin\alpha \tag{35}$$

where α is the angle between the corresponding coordinate curves.

In particular, for an orthogonal coordinate system, i.e. a system
whose coordinate curves intersect at right angles, we have

$$k = l_\lambda l_\mu \tag{36}$$

In the general case we can deduce from formulas (34) and (VII.21)
the formula

$$k = \left|\, \begin{vmatrix} \dfrac{\partial x}{\partial \lambda} & \dfrac{\partial y}{\partial \lambda} \\[2mm] \dfrac{\partial x}{\partial \mu} & \dfrac{\partial y}{\partial \mu} \end{vmatrix} \,\right| = \left| \frac{D(x, y)}{D(\lambda, \mu)} \right| \tag{37}$$

(see Sec. IX.13 on the last notation). In deducing the formula we
apply the property of determinants (property 7 in Sec. VI.2) according
to which a determinant does not change its value when being
transposed.

The same result can be obtained if we note that the coefficient
$k$ in formula (34) characterizes the change of areas under the mapping
of the $\lambda, \mu$-plane on the $x, y$-plane defined by the formulas
$x = x(\lambda, \mu)$, $y = y(\lambda, \mu)$. By Sec. XI.14, this
coefficient is equal to the absolute value of the corresponding
Jacobian, i.e. of the determinant entering into (37).

[Let the reader deduce the formula $k = \rho$ for a polar coordinate
system taking advantage of formula (37). Do the same on the basis
of formula (36). Find Lamé's coefficients and the coefficient $k$ for
a Cartesian coordinate system.]

If we take an integral of the form

$$I = \int_{(\sigma)} u\, d\sigma$$

taken over a finite plane region $(\sigma)$ we obtain, on the basis of
formula (34), the expression

$$I = \iint_{(\sigma)} u k\, d\lambda\, d\mu \tag{38}$$

where the limits of integration must be set up by analogy with
integrals (23), (24) and (30). The simplest case for setting up the
limits of integration is when the region in question is bounded by
a contour consisting of arcs of the corresponding coordinate curves
(why is it so?). [Let the reader verify that formula (38) turns into
formula (30) in the case of polar coordinates.]
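
The exercise on polar coordinates can be checked numerically. The
following is only a minimal sketch: the function names `jacobian_2d`,
`polar_x` and `polar_y` are hypothetical, and the partial derivatives
entering the Jacobian of formula (37) are approximated by central
differences rather than computed exactly.

```python
import math


def jacobian_2d(x, y, lam, mu, step=1e-6):
    """Approximate |D(x, y)/D(lam, mu)| at the point (lam, mu).

    x and y are functions of the two curvilinear coordinates; the
    partial derivatives are replaced by central differences.
    """
    x_lam = (x(lam + step, mu) - x(lam - step, mu)) / (2 * step)
    x_mu = (x(lam, mu + step) - x(lam, mu - step)) / (2 * step)
    y_lam = (y(lam + step, mu) - y(lam - step, mu)) / (2 * step)
    y_mu = (y(lam, mu + step) - y(lam, mu - step)) / (2 * step)
    return abs(x_lam * y_mu - x_mu * y_lam)


def polar_x(rho, phi):
    return rho * math.cos(phi)


def polar_y(rho, phi):
    return rho * math.sin(phi)


# For polar coordinates the coefficient k should coincide with rho.
k_polar = jacobian_2d(polar_x, polar_y, 2.0, 0.7)
```

Here `k_polar` should agree with the value $\rho = 2$ up to the error
of the finite-difference approximation.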

15. Curvilinear Coordinates in Space. General curvilinear coordinates
$\lambda, \mu, \nu$ in space are considered in a similar way. The surfaces
$\lambda = \text{const}$, $\mu = \text{const}$ and $\nu = \text{const}$ form
three families of coordinate surfaces whose intersections generate three
families of coordinate curves. The coordinate surfaces corresponding to
the values $\lambda$, $\lambda + d\lambda$; $\mu$, $\mu + d\mu$ and $\nu$,
$\nu + d\nu$ of the coordinates bound an elementary volume (cell) in space
which can be regarded as a parallelepiped to within infinitesimals of
higher order (the parallelepiped can be oblique in the general case).
Such parallelepipeds are shown in Figs. 316 and 317 for the concrete
cases of cylindrical and spherical coordinates (in these cases the
parallelepipeds are rectangular). The length of one of the edges of the
infinitesimal parallelepiped is equal to

$$ds_\lambda = |d_\lambda \mathbf{r}| = |\mathbf{r}'_\lambda|\, d\lambda = l_\lambda\, d\lambda$$

(to within infinitesimals of higher order) where

$$l_\lambda = \sqrt{\left(\frac{\partial x}{\partial \lambda}\right)^2 + \left(\frac{\partial y}{\partial \lambda}\right)^2 + \left(\frac{\partial z}{\partial \lambda}\right)^2}$$

is a Lamé coefficient. The lengths of the other two edges of the
parallelepiped are expressed similarly. The volume of the parallelepiped
is equal to $d\Omega = k\, d\lambda\, d\mu\, d\nu$ where $k$ is a coefficient
which can take on different values at different points in space. Therefore
the corresponding change of variables in a triple integral is performed
according to the formula

$$\int_{(\Omega)} u\, d\Omega = \iiint_{(\Omega)} u k\, d\lambda\, d\mu\, d\nu \tag{39}$$

In the case of an orthogonal coordinate system we have
$k = l_\lambda l_\mu l_\nu$.

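In the general case the coefficient $k$ can be found on the basis of the
geometric meaning of a triple scalar product of vectors (see Sec.
VII.15):

$$k = \frac{d\Omega}{d\lambda\, d\mu\, d\nu} = \frac{\left| (\mathbf{r}'_\lambda\, d\lambda \times \mathbf{r}'_\mu\, d\mu) \cdot \mathbf{r}'_\nu\, d\nu \right|}{d\lambda\, d\mu\, d\nu} = \left|\, \begin{vmatrix} \dfrac{\partial x}{\partial \lambda} & \dfrac{\partial y}{\partial \lambda} & \dfrac{\partial z}{\partial \lambda} \\[2mm] \dfrac{\partial x}{\partial \mu} & \dfrac{\partial y}{\partial \mu} & \dfrac{\partial z}{\partial \mu} \\[2mm] \dfrac{\partial x}{\partial \nu} & \dfrac{\partial y}{\partial \nu} & \dfrac{\partial z}{\partial \nu} \end{vmatrix} \,\right| = \left| \frac{D(x, y, z)}{D(\lambda, \mu, \nu)} \right|$$

(Let the reader calculate the coefficients $l_\lambda$, $l_\mu$, $l_\nu$
and $k$ for Cartesian, cylindrical and spherical coordinates.)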
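
The calculation suggested above can be sketched numerically for
spherical coordinates. This is an illustration under stated assumptions,
not a proof: the helper names `tangent`, `lame_and_k` and `spherical`
are hypothetical, the angle $\theta$ is assumed to be measured from the
$z$-axis, and the derivatives are approximated by central differences.

```python
import math


def tangent(r_vec, coords, index, step=1e-6):
    """Central-difference approximation of the partial derivative of the
    radius-vector r_vec with respect to the coordinate number `index`."""
    plus = list(coords)
    minus = list(coords)
    plus[index] += step
    minus[index] -= step
    p = r_vec(*plus)
    m = r_vec(*minus)
    return [(a - b) / (2 * step) for a, b in zip(p, m)]


def lame_and_k(r_vec, coords):
    """Return the three Lame coefficients and k = |D(x,y,z)/D(lam,mu,nu)|."""
    t = [tangent(r_vec, coords, i) for i in range(3)]
    lame = [math.sqrt(sum(c * c for c in v)) for v in t]
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = t
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    return lame, abs(det)


def spherical(r, theta, phi):
    """Radius-vector in spherical coordinates (theta measured from the z-axis)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


lame, k = lame_and_k(spherical, (2.0, 0.9, 0.4))
```

For this orthogonal system the computed `k` should coincide with the
product of the three Lamé coefficients, that is with $r^2 \sin\theta$.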

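16. Coordinates on a Surface. It is possible to introduce a coordinate
system on an arbitrary surface (see Sec. X.6 and Fig. 211).
Let us denote the coordinates by the letters $\lambda$ and $\mu$. After the
manner of Sec. 14, we can express, to within infinitesimals of higher
order, the sides and the area of an infinitesimal parallelogram bounded
by the coordinate curves corresponding to the values $\lambda$,
$\lambda + d\lambda$ and $\mu$, $\mu + d\mu$ of the coordinates. Indeed,

$$ds_\lambda = |d_\lambda \mathbf{r}| = l_\lambda\, d\lambda$$

where

$$l_\lambda = |\mathbf{r}'_\lambda| = \sqrt{\left(\frac{\partial x}{\partial \lambda}\right)^2 + \left(\frac{\partial y}{\partial \lambda}\right)^2 + \left(\frac{\partial z}{\partial \lambda}\right)^2}$$

($ds_\mu$ is expressed similarly). Thus, we have
$d\sigma = k\, d\lambda\, d\mu$ where $k = l_\lambda l_\mu$ for an orthogonal
coordinate system and

$$k = |\mathbf{r}'_\lambda \times \mathbf{r}'_\mu| = \sqrt{\left(\frac{\partial y}{\partial \lambda}\frac{\partial z}{\partial \mu} - \frac{\partial z}{\partial \lambda}\frac{\partial y}{\partial \mu}\right)^2 + \left(\frac{\partial x}{\partial \lambda}\frac{\partial z}{\partial \mu} - \frac{\partial z}{\partial \lambda}\frac{\partial x}{\partial \mu}\right)^2 + \left(\frac{\partial x}{\partial \lambda}\frac{\partial y}{\partial \mu} - \frac{\partial y}{\partial \lambda}\frac{\partial x}{\partial \mu}\right)^2}$$

in the general case. The transformation of a surface integral to the
variables $\lambda$ and $\mu$ is performed according to formula (38).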

As an example, let us consider the surface of a sphere of fixed
radius $R$. The spherical coordinates $\varphi, \theta$ considered on the
sphere are an example of curvilinear coordinates on a surface. These
coordinates are expressed as ordinary spherical coordinates in space
with a fixed value $r = R$ of the radius. This is an orthogonal system,
and Fig. 317 directly implies that

$$ds_\varphi = R \sin\theta\, d\varphi, \qquad ds_\theta = R\, d\theta$$

i.e.

$$l_\varphi = R \sin\theta, \qquad l_\theta = R$$

The coefficient $k$ is expressed as $k = l_\varphi l_\theta = R^2 \sin\theta$
in this case. Consequently, an integral taken over a portion $(\sigma)$ of
the sphere (which can coincide with the whole sphere) is computed by the
formula

$$\int_{(\sigma)} u\, d\sigma = R^2 \iint_{(\sigma)} u \sin\theta\, d\varphi\, d\theta \tag{40}$$
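
Formula (40) can be tried out numerically. The sketch below is only an
illustration: the name `sphere_integral` is hypothetical, the whole
sphere is taken as the domain ($0 \le \theta \le \pi$,
$0 \le \varphi \le 2\pi$), and the double integral is replaced by a
midpoint sum. For $u = 1$ the result should be close to the area
$4\pi R^2$ of the sphere.

```python
import math


def sphere_integral(u, radius, n_theta=200, n_phi=200):
    """Midpoint-rule approximation of the integral of u(phi, theta) over
    the whole sphere of the given radius, following formula (40)."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # u * k with k = R^2 sin(theta); R^2 is factored out below
            total += u(phi, theta) * math.sin(theta)
    return radius ** 2 * total * d_theta * d_phi


area = sphere_integral(lambda phi, theta: 1.0, 3.0)
```

The same function can be applied to any finite integrand given as a
function of $\varphi$ and $\theta$.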


As an example, let us determine the force of attraction between
a material point of mass $m$ and the surface $(\sigma)$ of the whole
material sphere of radius $R$ with a constant surface density of mass
$\rho$. Because of the symmetry, we can, without loss of generality,
limit ourselves to the disposition shown in Fig. 320. Every surface
element $d\sigma$ attracts the mass $m$ with the force $d\mathbf{F}$ which
can be found on the basis of Newton's law of gravitation:

$$|d\mathbf{F}| = \kappa \frac{m \rho\, d\sigma}{l^2} = \kappa \frac{m \rho}{R^2 + h^2 - 2Rh \cos\theta}\, d\sigma \tag{41}$$

where $\kappa$ is the constant of gravitation.

When summing up these elementary forces we must add together the
projections of the forces on the coordinate axes but not the absolute
values of the forces because they have different directions. The
symmetry implies that the resultant force is directed along the
$z$-axis and therefore we have to add together the projections of all
the elementary forces on the $z$-axis:

Fig. 320

$$F = \int_{(\sigma)} (d\mathbf{F})_z = \int_{(\sigma)} |d\mathbf{F}| \cos\alpha = \int_{(\sigma)} \kappa \frac{m \rho\, d\sigma}{R^2 + h^2 - 2Rh \cos\theta} \cdot \frac{l^2 + h^2 - R^2}{2hl}$$

(the expression of $\cos\alpha$ has been found on the basis of the cosine
law which can be written as $R^2 = l^2 + h^2 - 2lh \cos\alpha$ in this case).

Substituting $l = \sqrt{R^2 + h^2 - 2Rh \cos\theta}$ into the last integral
and passing to spherical coordinates according to formula (40)
we obtain

$$\begin{aligned}
F &= \kappa m \rho \int_{(\sigma)} \frac{h - R\cos\theta}{(R^2 + h^2 - 2Rh\cos\theta)^{3/2}}\, d\sigma \\
&= \kappa m \rho \int_0^{\pi} d\theta \int_0^{2\pi} \frac{h - R\cos\theta}{(R^2 + h^2 - 2Rh\cos\theta)^{3/2}}\, R^2 \sin\theta\, d\varphi \\
&= 2\pi R^2 \kappa m \rho \int_0^{\pi} \frac{h - R\cos\theta}{(R^2 + h^2 - 2Rh\cos\theta)^{3/2}} \sin\theta\, d\theta \\
&= -2\pi R^2 \kappa m \rho \int_0^{\pi} \frac{h - R\cos\theta}{(R^2 + h^2 - 2Rh\cos\theta)^{3/2}}\, d\cos\theta \\
&= 2\pi R^2 \kappa m \rho \int_{-1}^{1} \frac{h - Rt}{(R^2 + h^2 - 2Rht)^{3/2}}\, dt
\end{aligned}$$

(we have put $\cos\theta = t$). Passing to the variable of integration
$l = \sqrt{R^2 + h^2 - 2Rht}$ we obtain

$$F = 2\pi R^2 \kappa m \rho \int_{R+h}^{|R-h|} \frac{l^2 + h^2 - R^2}{2h\, l^3} \left( -\frac{2l\, dl}{2Rh} \right) = \frac{\pi R \kappa m \rho}{h^2} \int_{|R-h|}^{R+h} \left( 1 + \frac{h^2 - R^2}{l^2} \right) dl \tag{42}$$
If $h > R$ we have $|R - h| = h - R$. Substituting this expression
into formula (42) we obtain (check it up!) the formula

$$F = \kappa \frac{m \cdot 4\pi R^2 \rho}{h^2} = \kappa \frac{mM}{h^2} \qquad (h > R)$$

where $M$ is the total mass of the sphere. If $h < R$ we have
$|R - h| = R - h$, and thus we similarly find

$$F = 0 \qquad (h < R)$$

Thus, a material sphere attracts material points lying outside it
as if the whole mass of the sphere were concentrated at its centre
and does not attract points lying inside it. Now let us consider
a material solid of spherical form with a spherically symmetric mass
distribution, that is a solid whose density depends only on the
distance from the centre. Such a solid can be thought of as consisting
of spherical layers of infinitesimal width bounded by concentric
spheres. Each layer can be regarded as a material surface to which
the above result can be applied. Thus, we conclude that the spherical
solid attracts a point lying outside it as if the whole mass were
concentrated at the centre. Similarly, a point lying inside the solid
is attracted only by the portion of the solid which lies closer to
the centre than the point itself.
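
The two results for the material sphere can be checked numerically
from the single integral over $\theta$ obtained above. The following
sketch is given under stated assumptions: the name `shell_force` and
the default values $\kappa = 1$, $m = 1$ are hypothetical choices, and
the integral is approximated by a midpoint sum (the point must not lie
on the sphere itself, i.e. $h \ne R$).

```python
import math


def shell_force(h, radius, rho, m=1.0, kappa=1.0, n=20000):
    """Approximate the z-projection F of the attraction between a point
    mass m at distance h from the centre and a sphere of the given radius
    with constant surface density rho:

        F = 2 pi R^2 kappa m rho *
            integral_0^pi (h - R cos t) sin t / (R^2 + h^2 - 2 R h cos t)^(3/2) dt
    """
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        l_squared = radius ** 2 + h ** 2 - 2 * radius * h * math.cos(theta)
        total += (h - radius * math.cos(theta)) / l_squared ** 1.5 * math.sin(theta)
    return 2 * math.pi * radius ** 2 * kappa * m * rho * total * d_theta


outside = shell_force(2.0, 1.0, 1.0)   # point outside the sphere
inside = shell_force(0.5, 1.0, 1.0)    # point inside the sphere
```

With $R = 1$ and $\rho = 1$ the total mass is $M = 4\pi$, so `outside`
should be close to $\kappa m M / h^2 = \pi$, whereas `inside` should be
close to zero.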
§ 5. Other Types of Multiple Integrals

17. Improper Integrals. The theory of improper multiple integrals
is similar to that of one-dimensional integrals (see § XIV.4). We
begin with the integral

$$I = \int_{(\Omega)} u\, d\Omega \tag{43}$$

in which the integrand $u$ is a finite function and the domain of
integration is unbounded (infinite). It is defined as the limit

$$\int_{(\Omega)} u\, d\Omega = \lim_{(\Omega') \to (\Omega)} \int_{(\Omega')} u\, d\Omega \tag{44}$$

where the domain $(\Omega')$ on the right-hand side is finite. This domain
expands in the limiting process and exhausts the whole domain $(\Omega)$
in the limit (see Fig. 321). If limit (44) exists, is finite and
independent of a particular way in which the domain $(\Omega')$ expands,
integral (43) is said to be convergent. Otherwise the integral is
referred to as being divergent.

Fig. 321 (labels: boundary of the domain $(\Omega')$; boundary of the domain $(\Omega)$)

If limit (44) equals infinity we write $\int_{(\Omega)} u\, d\Omega = \infty$.
If $u \ge 0$ the integral either converges or diverges and
$\int_{(\Omega)} u\, d\Omega = +\infty$ (i.e. it is divergent to infinity).

In such a case we can set up the limits of integration in any
coordinate system (convenient for the computation) by the rules given
in §§ 3, 4. If the result of the substitution of the limits is finite
the integral is convergent and otherwise the integral is divergent.
The comparison tests [see (XIV.49) and (XIV.50)] remain valid in this
case. In applying the comparison tests we can use integrals (XIV.51)
and some other integrals. For instance, in the case when $(\Omega)$ is
the whole plane we often perform the comparison with an integral of
the function $r^{-p}$ where $r = \sqrt{x^2 + y^2}$ is the length of the
radius-vector. It is only the behaviour of the integrand for large
values of $r$ that is essential for the convergence of an integral of
this type, and therefore, we must investigate the integral

$$\iint_{(r > r_0)} r^{-p}\, dx\, dy = \int_0^{2\pi} d\varphi \int_{r_0}^{\infty} r^{-p}\, r\, dr = 2\pi \int_{r_0}^{\infty} \frac{dr}{r^{p-1}} \qquad (r_0 > 0)$$

(we have passed to polar coordinates here). According to Sec. XIV.15
[see formula (XIV.51)], the integral is finite for $p > 2$ and infinite
for $p \le 2$. Similarly, in the three-dimensional space $x, y, z$ the
integral of $r^{-p} = (\sqrt{x^2 + y^2 + z^2})^{-p}$ (taken over the whole
space with a sphere of radius $r_0 > 0$ and centre at the origin of
coordinates cut out) converges only for $p > 3$.
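
The dependence on $p$ in the plane case can be illustrated by evaluating
the truncated integral over the ring $r_0 < r < R'$ in closed form and
letting the outer radius grow. This is a small sketch; the name
`plane_tail_integral` and the parameter `outer` (the outer radius $R'$)
are hypothetical.

```python
import math


def plane_tail_integral(p, r0, outer):
    """Integral of r**(-p) over the ring r0 < r < outer in the plane,
    i.e. 2*pi * integral of r**(1 - p) dr from r0 to outer (closed form)."""
    if p == 2:
        return 2 * math.pi * math.log(outer / r0)
    return 2 * math.pi * (outer ** (2 - p) - r0 ** (2 - p)) / (2 - p)


# p > 2: the values settle down as the outer radius grows
settled = [plane_tail_integral(3, 1.0, outer) for outer in (1e3, 1e6, 1e9)]
# p = 2: the values keep growing like a logarithm
growing = [plane_tail_integral(2, 1.0, outer) for outer in (1e3, 1e6, 1e9)]
```

For $p = 3$ and $r_0 = 1$ the values approach $2\pi r_0^{2-p}/(p - 2) = 2\pi$,
while for $p = 2$ they increase without bound.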

As an example of an application of improper multiple integrals,
let us deduce formula (XIV.70). We take the integral

$$I = \int_0^{\infty} \int_0^{\infty} x^{p-1} y^{p+q-1} e^{-(x+1)y}\, dx\, dy \qquad (p > 0,\ q > 0)$$

whose domain of integration is the first quadrant of the $x, y$-plane.
The integrand being a positive function, we can perform the
integration in any order. This yields the following results:

(1)
$$I = \int_0^{\infty} dy \int_0^{\infty} x^{p-1} y^{p+q-1} e^{-(x+1)y}\, dx = \int_0^{\infty} dy \int_0^{\infty} \left(\frac{s}{y}\right)^{p-1} y^{p+q-1} e^{-\left(\frac{s}{y} + 1\right) y}\, \frac{ds}{y}$$

(we have made the substitution $x = \dfrac{s}{y}$). Calculating we find:

$$I = \int_0^{\infty} s^{p-1} e^{-s}\, ds \cdot \int_0^{\infty} y^{q-1} e^{-y}\, dy = \Gamma(p)\, \Gamma(q)$$

(2)
$$I = \int_0^{\infty} dx \int_0^{\infty} x^{p-1} y^{p+q-1} e^{-(x+1)y}\, dy = \int_0^{\infty} dx \int_0^{\infty} x^{p-1} \left(\frac{t}{x+1}\right)^{p+q-1} e^{-t}\, \frac{dt}{x+1}$$

(we have employed the change of variable $y = \dfrac{t}{x+1}$). Now,
applying formula (XIV.73) we deduce:

$$I = \int_0^{\infty} \frac{x^{p-1}}{(x+1)^{p+q}}\, dx \cdot \int_0^{\infty} t^{p+q-1} e^{-t}\, dt = B(p, q)\, \Gamma(p+q)$$

Comparing the results we derive the desired formula.
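
Comparing the two results gives the relation
$B(p, q) = \Gamma(p)\,\Gamma(q)/\Gamma(p+q)$ (we assume here that this is
the content of the formula being deduced). The relation can be tested
numerically. In the sketch below the name `beta_numeric` is hypothetical,
and the beta function is taken in its standard form
$B(p, q) = \int_0^1 t^{p-1} (1-t)^{q-1}\, dt$, which follows from the
integral over $(0, \infty)$ used above by the substitution
$x = t/(1-t)$. The midpoint sum is only suitable for $p \ge 1$,
$q \ge 1$, when the integrand is bounded.

```python
import math


def beta_numeric(p, q, n=20000):
    """Midpoint-rule approximation of B(p, q) = integral_0^1 t^(p-1) (1-t)^(q-1) dt.

    Intended for p >= 1 and q >= 1 only (bounded integrand)."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * step
        total += t ** (p - 1) * (1 - t) ** (q - 1)
    return total * step


def beta_via_gamma(p, q):
    """Right-hand side of the relation: Gamma(p) Gamma(q) / Gamma(p + q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


difference = abs(beta_numeric(2.5, 3.0) - beta_via_gamma(2.5, 3.0))
```

The quantity `difference` should be small compared with the values
themselves.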

If the function $u$ takes on the values of both signs and

$$\int_{(\Omega)} |u|\, d\Omega < \infty \tag{45}$$

integral (43) converges and is said to be absolutely convergent.
In evaluating such an integral we can choose any convenient
coordinate system and set up the limits of integration. It can also be
proved that if condition (45) is violated integral (43) is divergent.
In this case limit (44) can depend on the way in which the domain
$(\Omega')$ expands. Then it can happen that after setting up the limits
of integration and evaluating the integral we obtain a finite result 
in one coordinate system, an infinite result in some other system, 
a finite result but different from the first one in a third system and 
a divergence of an oscillating type in a fourth coordinate system 
(see Sec. XIV.14). Hence, in such a case the possibility of changing 
variables and inverting the order of integration should be additional- 
ly investigated. There are no such problems in evaluating absolutely 
convergent integrals. 

Improper multiple integrals of other types are treated similarly. 
Namely, if the domain of integration contains a point, a curve or 
a surface on which the integrand approaches infinity, the singula- 
rity (i.e. the point, the curve or the surface) is cut out of the domain 
together with its neighbourhood and then the boundary of the cut- 
out portion is contracted to the singularity in an arbitrary way. 
The limit thus obtained is taken as the value of the improper integral 
provided this limit is finite and independent of the way of the con- 
traction. If an improper integral of this type of a function which 
is positive everywhere or positive near its singularities converges 
the integration can be performed in any coordinate system. The 
same can be done for arbitrary functions if they are absolutely 
integrable. When investigating an integral whose integrand has an 
isolated singularity, that is a separate point at which the integrand 
approaches infinity, we often apply the comparison test using the
integral $\iint_{r < r_0} r^{-p}\, dx\, dy$ in plane and the integral
$\iiint_{r < r_0} r^{-p}\, dx\, dy\, dz$ in space. We can easily verify
that the former converges only for $p < 2$ and the latter for $p < 3$.
If there is a non-isolated singularity
then in investigating the convergence it is convenient to choose a 
coordinate system in such a way that the singularity should coin- 
cide with one of the coordinate curves or surfaces. 

18. Integrals Dependent on a Parameter. Let us consider integrals
of the form

$$I(\lambda) = \int_{(\Omega)} f(M, \lambda)\, d\Omega$$

where $M$ is a variable point in the domain $(\Omega)$ whose coordinates
are the variables of integration and $\lambda$ is a parameter which is
kept constant in the process of integration. The theory of such
integrals is developed by analogy with Sec. XIV.5. All the basic
assertions proved there remain valid here. There is a difficulty in
this case which arises when the domain of integration also depends on
the parameter. In such cases we often try to change the variables so
that the new domain of integration should become fixed. But, of
course, such integrals can also be investigated directly.

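For instance, let us consider a triple integral of the form

$$I(\lambda) = \iiint_{(\varphi_\lambda(x, y, z) \le 0)} f(M)\, d\Omega$$

where the function $\varphi_\lambda(x, y, z)$ depends on the parameter
$\lambda$ and the domain of integration is a region in which
$\varphi_\lambda \le 0$. Let it be necessary to compute the derivative
$\dfrac{dI}{d\lambda}$. The expression $dI$ coincides, to within
infinitesimals of higher order, with the difference
$I(\lambda + d\lambda) - I(\lambda)$ which is equal to an integral of
$f(M)$ taken over a thin layer bounded by the surfaces $(S_\lambda)$ and
$(S_{\lambda + d\lambda})$ with the equations $\varphi_\lambda = 0$ and
$\varphi_{\lambda + d\lambda} = 0$, respectively. Let us take a point
$A$ on $(S_\lambda)$ and draw the normal to $(S_\lambda)$ at $A$. We reckon
the distances from $A$ along the normal and consider them positive in
the direction to the region where $\varphi_\lambda > 0$, that is in the
direction of outer normal to the boundary surface of the domain of
integration. We now designate the point of intersection of the normal
with the surface $(S_{\lambda + d\lambda})$ by $\bar{A}$. Then the quantity
$dn = A\bar{A}$ is equal to the width of the layer at the point $A$.
We have $\varphi_\lambda(A) = 0$ because the point $A$ belongs to
$(S_\lambda)$. We also have $\varphi_{\lambda + d\lambda}(\bar{A}) = 0$
(why?). Now we can write the relation

$$\varphi_{\lambda + d\lambda}(\bar{A}) = \varphi_\lambda(\bar{A}) + \frac{\partial \varphi_\lambda}{\partial \lambda}\, d\lambda = \varphi_\lambda(A) + |\operatorname{grad} \varphi_\lambda|\, dn + \frac{\partial \varphi_\lambda}{\partial \lambda}\, d\lambda$$

which is accurate to within infinitesimals of higher order. It follows
that

$$|\operatorname{grad} \varphi_\lambda|\, dn + \frac{\partial \varphi_\lambda}{\partial \lambda}\, d\lambda = 0, \quad \text{i.e.} \quad dn = -\frac{\partial \varphi_\lambda / \partial \lambda}{|\operatorname{grad} \varphi_\lambda|}\, d\lambda$$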

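The element of volume of the layer can be written as
$d\Omega = dS\, dn$ where $dS$ is the surface element of $(S_\lambda)$.
Consequently, we have

$$d\Omega = dS\, dn = -\frac{\partial \varphi_\lambda / \partial \lambda}{|\operatorname{grad} \varphi_\lambda|}\, dS\, d\lambda$$

The quantity $dn$ can be positive or negative here. Its sign indicates
whether the volume of the layer is added to the volume of the original
domain (corresponding to the value $\lambda$ of the parameter) or
subtracted from it. It follows that

$$dI = -\int_{(S_\lambda)} f\, \frac{\partial \varphi_\lambda / \partial \lambda}{|\operatorname{grad} \varphi_\lambda|}\, dS\, d\lambda, \quad \text{i.e.} \quad \frac{dI}{d\lambda} = -\int_{(\varphi_\lambda = 0)} f\, \frac{\partial \varphi_\lambda / \partial \lambda}{|\operatorname{grad} \varphi_\lambda|}\, dS$$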
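
The formula for the derivative can be tried out on a simple case. In
the sketch below (all names are hypothetical) the domain is the ball
$\varphi_\lambda = x^2 + y^2 + z^2 - \lambda^2 \le 0$ and the integrand
is $f = x^2 + y^2 + z^2$, so that $I(\lambda) = 4\pi\lambda^5/5$. On the
boundary sphere we have $\partial\varphi_\lambda/\partial\lambda = -2\lambda$,
$|\operatorname{grad}\varphi_\lambda| = 2\lambda$ and $f = \lambda^2$, and
the surface integral reduces to a product with the area of the sphere.

```python
import math


def ball_integral(lam):
    """I(lam): integral of f = x^2 + y^2 + z^2 over the ball of radius lam."""
    return 4 * math.pi * lam ** 5 / 5


def boundary_formula(lam):
    """-(integral over S_lam of f * (d phi / d lam) / |grad phi| dS)
    for phi = x^2 + y^2 + z^2 - lam^2; all factors are constant on S_lam."""
    f_on_surface = lam ** 2
    dphi_dlam = -2 * lam
    grad_norm = 2 * lam
    area = 4 * math.pi * lam ** 2
    return -f_on_surface * dphi_dlam / grad_norm * area


def finite_difference(lam, step=1e-6):
    """Central-difference approximation of dI/dlam."""
    return (ball_integral(lam + step) - ball_integral(lam - step)) / (2 * step)


gap = abs(finite_difference(1.5) - boundary_formula(1.5))
```

The value `gap` should be negligible, since both expressions represent
$dI/d\lambda = 4\pi\lambda^4$.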

An integral can depend on several parameters. The coordinates 
of a point N varying within a certain region can play the role of 
such parameters. An integral of this type is written as 

$$I(N) = \int_{(\Omega)} f(M, N)\, d\Omega_M$$

where the symbol $d\Omega_M$ indicates the fact that it is the moving point
M whose coordinates are the variables of integration (see Sec. 2). 
The point N which can occupy different positions is kept fixed in 
the process of integration. Its coordinates are the parameters. The 
basic properties of integrals dependent on a parameter (see Sec. 
XIV.5) are easily extended to these integrals. 

An integral can be integrated with respect to a parameter on 
which it depends. This results in a multiple integral of higher order. 
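For example, let us consider the problem of calculating the force
$\mathbf{F}$ of attraction between two material bodies $(\Omega_1)$ and
$(\Omega_2)$ whose densities $\rho_1$ and $\rho_2$ can be variable in the
general case. Let us take two elements of volume $d\Omega_1$ and
$d\Omega_2$ placed at some points $M_1$ and $M_2$ belonging to
$(\Omega_1)$ and $(\Omega_2)$, respectively. The force with which
the element $d\Omega_1$ attracts the element $d\Omega_2$ can be expressed
on the basis of Newton's law of gravitation:

$$d\, d\mathbf{F} = \kappa \frac{\rho_1\, d\Omega_1\, \rho_2\, d\Omega_2}{|\overrightarrow{M_2 M_1}|^2}\, (\overrightarrow{M_2 M_1})^0 = \kappa \frac{\overrightarrow{M_2 M_1}}{|\overrightarrow{M_2 M_1}|^3}\, \rho_1 \rho_2\, d\Omega_1\, d\Omega_2$$

where $(\overrightarrow{M_2 M_1})^0$ is the unit vector in the direction of
the vector $\overrightarrow{M_2 M_1}$. Now, integrating over $(\Omega_1)$,
we obtain the force with which the whole body $(\Omega_1)$ attracts the
element $d\Omega_2$:

$$d\mathbf{F} = \kappa \left( \int_{(\Omega_1)} \frac{\overrightarrow{M_2 M_1}}{|\overrightarrow{M_2 M_1}|^3}\, \rho_1\, d\Omega_1 \right) \rho_2\, d\Omega_2$$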

In the above integral the integration is performed with respect
to the coordinates of the variable point $M_1$ running throughout
the domain $(\Omega_1)$ when the point $M_2$ is arbitrarily fixed in the
domain $(\Omega_2)$, the coordinates of $M_2$ being parameters. To obtain
the resultant force of attraction we must additionally integrate with
respect to the coordinates of $M_2$:

$$\mathbf{F} = \kappa \int_{(\Omega_2)} \rho_2\, d\Omega_2 \int_{(\Omega_1)} \frac{\overrightarrow{M_2 M_1}}{|\overrightarrow{M_2 M_1}|^3}\, \rho_1\, d\Omega_1$$


If we set up the limits of integration we shall obtain a six-fold
iterated integral. For instance, in Cartesian coordinates we shall
have

$$\mathbf{F} = \kappa \iiint_{(\Omega_2)} \rho_2(x_2, y_2, z_2)\, dx_2\, dy_2\, dz_2 \times \iiint_{(\Omega_1)} \rho_1(x_1, y_1, z_1)\, \frac{(x_1 - x_2)\mathbf{i} + (y_1 - y_2)\mathbf{j} + (z_1 - z_2)\mathbf{k}}{[(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2]^{3/2}}\, dx_1\, dy_1\, dz_1$$

The limits of integration which are to be set up in the above integral
depend on the form of the domains $(\Omega_1)$ and $(\Omega_2)$.

19. Integrals with Respect to Measure. Generalized Functions. 
For definiteness, let us consider triple integrals. In Sec. 1, the volume 
of a domain in space was called its measure. But this is only the 
simplest example of a measure which is referred to as the Lebesgue 
measure (after the French mathematician H. Lebesgue, 1875-1941, 
who developed the general theory of the measure). It is also possible 
to introduce other measures with respect to which integration can 
be performed. 

We begin with an example. Let a mass $m$ be distributed in space
(see Sec. 6). By analogy with Sec. 18, we can readily deduce the
expression

$$\mathbf{F} = \kappa m_0 \int_{(\Omega)} \frac{\overrightarrow{NM}}{|\overrightarrow{NM}|^3}\, dm \tag{46}$$

of the force with which the distributed mass $m$ acts upon a mass
point $m_0$ placed at the point $N$. Here $M$ is a variable point in space
whose coordinates are the variables of integration and the integration
extends over the whole region of space in which the mass $m$
is distributed.

If the mass $m$ is distributed continuously, that is if the mass of
every surface, curve or point is equal to zero, we can introduce the
density $\rho$ (see Sec. 6), and then integral (46) can be reduced to an
ordinary triple integral

$$\mathbf{F} = \kappa m_0 \int_{(\Omega)} \frac{\overrightarrow{NM}}{|\overrightarrow{NM}|^3}\, \rho\, d\Omega \tag{47}$$

But we sometimes deal with a mass which is concentrated on
separate surfaces, curves or at points. Then the ordinary notion of
density cannot be applied and hence it is impossible to pass to
integral (47). In such cases we have to consider integral (46) as an
integral with respect to the measure $m$.
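
When the whole mass is concentrated at a finite number of points,
integral (46) with respect to the measure $m$ turns into a finite sum
over these points. The following sketch illustrates this special case
only; the name `force_from_point_masses` and the default value
$\kappa = 1$ are hypothetical.

```python
import math


def force_from_point_masses(n_point, m0, masses, kappa=1.0):
    """Force acting on the mass m0 placed at n_point, produced by a measure
    concentrated at isolated points.

    `masses` is a list of pairs ((x, y, z), mass); integral (46) reduces to
    the sum of kappa * m0 * mass * NM / |NM|^3 over these points.
    """
    fx = fy = fz = 0.0
    for (x, y, z), mass in masses:
        dx, dy, dz = x - n_point[0], y - n_point[1], z - n_point[2]
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        weight = kappa * m0 * mass / dist ** 3
        fx += weight * dx
        fy += weight * dy
        fz += weight * dz
    return (fx, fy, fz)


single = force_from_point_masses((0.0, 0.0, 0.0), 1.0, [((2.0, 0.0, 0.0), 3.0)])
pair = force_from_point_masses(
    (0.0, 0.0, 0.0), 1.0,
    [((2.0, 0.0, 0.0), 3.0), ((-2.0, 0.0, 0.0), 3.0)],
)
```

For the single mass the force is directed towards it, and for the two
symmetric masses the contributions cancel.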

The general definition of a measure in space is analogous to the
basic definition given in Sec. 7. To every part $(\Omega)$ of space which
can be mentally isolated from it (i.e. to each solid, surface, curve
or point etc.), a certain value $\mu(\Omega)$ of the measure $\mu$ must
correspond. The condition of additivity is also imposed on $\mu$. The
measure of a surface, curve or point may not be equal to zero in the
general case. We usually deal with non-negative measures $\mu \ge 0$ but
it is sometimes expedient to consider signed measures (also called
charges) which can assume values of either sign. In such cases the
measure can be thought of not as a mass but as an electric charge
which can be positive or negative. A measure can be defined not
only in space but also on a surface or curve.

The definition of an integral with respect to measure is similar
to the ordinary definition given in Sec. 2 (integrals with respect
to measure are also called the Stieltjes integrals). Let us take a
domain $(\Omega)$ in space. If a measure $\mu$ and a function $u(M)$ [where
$M$ is a variable point running over $(\Omega)$] are defined in the domain
$(\Omega)$ the integral of $u$ with respect to the measure $\mu$ is
defined as

$$\int_{(\Omega)} u\, d\mu = \lim \sum_{k=1}^{n} u(M_k)\, \mu(\Delta\Omega_k) \tag{48}$$

where the meaning of the notation is obvious. Such an integral
always exists provided the function $u$ is finite in $(\Omega)$ and the
measure of the whole domain $(\Omega)$ is also finite (if $\mu$ is not
non-negative we must additionally impose the condition that
$\int_{(\Omega)} |d\mu| < \infty$ which means that the positive and negative
variations (parts) of $\mu$ should also be finite). If the function $u$
is discontinuous the form of integral sums used in (48) must also be
specified but we shall not discuss this question at length here.
Improper integrals with respect to measure are defined after the manner
of Sec. 17. The properties of integral (48) are similar to those
discussed in Sec. 3. When we speak about the properties related to
integrating inequalities we must additionally impose the condition
$\mu \ge 0$.

If the measure of every surface, curve or point is equal to zero
it is possible to pass to an ordinary triple integral taken over a
volume:

$$\int_{(\Omega)} u\, d\mu = \int_{(\Omega)} u\, \frac{d\mu}{d\Omega}\, d\Omega = \int_{(\Omega)} u \rho\, d\Omega \qquad \left( \rho = \frac{d\mu}{d\Omega} \right) \tag{49}$$

This transition can be performed for any measure but in the general
case $\rho$ will be a generalized function.
The simplest generalized function in space is the delta function

$$\delta(x - a)\, \delta(y - b)\, \delta(z - c) \tag{50}$$

(see Sec. XIV.25) which describes the volume density of unit mass
placed at the point $(a, b, c)$. The function
$\delta(y - b)\, \delta(z - c)$ describes the volume density of a mass
uniformly distributed along the straight line $y = b$, $z = c$ with unit
linear density. The function $\delta(z - c)$ is the volume density of a
mass uniformly distributed over the plane $z = c$ with unit areal
(surface) density. Using these functions and some other generalized
functions (in particular, delta functions depending on curvilinear
coordinates) we can perform transformation (49) in the general case.

The properties of the generalized functions of several variables 
are similar to those of functions of one variable (see Sec. XIV. 27). 
Generalized function (50) can be applied to constructing an influence
function (Green's function; see Sec. XIV.26) of the form

$$G(M, N) = G(x, y, z, \xi, \eta, \zeta)$$

where $(x, y, z)$ are the coordinates of the point $M$ (point of
observation) and $(\xi, \eta, \zeta)$ are the coordinates of the point $N$
at which the source (producing the corresponding action) is placed. When
investigating processes developing in time we also use the delta
function

$$\delta(x - a)\, \delta(y - b)\, \delta(z - c)\, \delta(t - \tau)$$

which yields an influence function of the form $G(M, t; N, \tau)$.

20. Multiple Integrals of Higher Order. A measure can also be
defined in a $k$-dimensional space or, as we say, in a $k$-dimensional
manifold (see Sec. X.2). The definition of an integral of form (48)
and its basic properties remain unchanged in this case. To pass to a
repeated integral we must introduce generalized coordinates
$t_1, t_2, \ldots, t_k$ in the manifold (see Sec. X.2), express the
integrand as a function of the form $u = u(t_1, \ldots, t_k)$ and find
the density $\rho(t_1, \ldots, t_k)$ which defines the element of measure
$d\mu = \rho(t_1, \ldots, t_k)\, dt_1\, dt_2 \ldots dt_k$ corresponding to
an infinitesimal generalized $k$-dimensional parallelepiped placed at the
variable point $(t_1, \ldots, t_k)$ and bounded by the corresponding
"coordinate surfaces" [which are $(k - 1)$-dimensional submanifolds in
the general case]. Then integral (48) takes the form

$$\int_{(\Omega)} u\, d\mu = \underbrace{\int \int \cdots \int}_{k\ \text{times}} u(t_1, t_2, \ldots, t_k)\, \rho(t_1, t_2, \ldots, t_k)\, dt_1\, dt_2 \ldots dt_k \tag{51}$$

where the limits of integration must be set up on the right-hand
side according to the ranges of variation of the coordinates
$t_1, \ldots, t_k$. The density $\rho$ entering in formula (51) is
understood as an ordinary function if the measure of every submanifold
of dimension $s < k$ (which can be defined by means of one or more
equations connecting the coordinates $t_1, \ldots, t_k$) is equal to
zero. In particular, this is the case if the density is finite
everywhere.

Otherwise, $\rho$ should be understood as a generalized function
(see Sec. 19).

If the notion of a volume (hypervolume) is introduced in the
space under consideration we can perform the integration with
respect to the volume which is a particular case of a measure. To
do this we must know the expression

$$d\Omega = h(t_1, \ldots, t_k)\, dt_1\, dt_2 \ldots dt_k \tag{52}$$

of the volume of an infinitesimal generalized parallelepiped bounded
by the corresponding coordinate surfaces (submanifolds). Then the
integral $\int_{(\Omega)} u\, d\Omega$ can be transformed by analogy
with (51).

The notion of an integral (with respect to a measure or hypervolume)
over a domain belonging to any submanifold of lower dimension lying in
the initial $k$-dimensional manifold is introduced in a similar way. In
the ordinary three-dimensional space we can consider line integrals,
surface integrals and triple integrals but in a $k$-dimensional space
there are $k$ different types of integral (what are these types?).

For the $k$-dimensional Cartesian space $E_k$ (see Sec. VII.18) we
put $h = 1$ in formula (52), i.e. we take the volume of unit
$k$-dimensional hypercube with unit sides as the unit measure of
hypervolume. Integrals of lower order in this space are defined under
the convention that the $p$-dimensional volume ($1 \le p < k$) of a
$p$-dimensional rectangular parallelepiped (finite or infinitesimal) is
equal to the product of the lengths of its sides (this is the Lebesgue
measure).

By analogy with Sec. XIV.23, we can consider integrals with
respect to coordinates taken over a $p$-dimensional manifold $(S)$
($1 \le p < k$) lying in $E_k$. But $(S)$ must be orientable in this case.
This is a new notion which cannot be easily visualized for $p > 1$
and we are going to discuss it at length here.

First of all we shall introduce the notion of a $p$-dimensional
tetrahedron. By definition, a one-dimensional tetrahedron is a line
segment, a two-dimensional tetrahedron is a triangle and a
three-dimensional one is a triangular pyramid. To obtain a
four-dimensional tetrahedron we take a three-dimensional tetrahedron
lying in a three-dimensional space which is considered to be a subspace
of some space of dimension $s > 3$. Then we take a point belonging
to the $s$-dimensional space which does not belong to the
three-dimensional subspace and connect this point by line segments with
all the points of the three-dimensional tetrahedron. The totality of
all the points belonging to these line segments is a four-dimensional
tetrahedron. The tetrahedrons of higher dimensions are constructed
similarly. Now take a $p$-dimensional tetrahedron with vertices
$A_1, A_2, \ldots, A_{p+1}$. Its orientation is specified by enumerating
these vertices in a certain order. It is assumed that the permutation of
any two vertices changes the orientation to the opposite one. For
instance, if we take a three-dimensional tetrahedron with vertices
$A, B, C, D$ the combinations $ABCD$ and $DBAC$ define the same
orientation whereas the combination $CBAD$ defines the opposite
orientation. Every tetrahedron can be oriented in two different
ways.
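
Since every permutation of two vertices reverses the orientation, two
enumerations of the vertices define the same orientation exactly when
one is obtained from the other by an even permutation. This can be
tested by counting inversions, as in the following sketch (the name
`same_orientation` is hypothetical).

```python
def same_orientation(order_a, order_b):
    """Return True if the vertex enumeration order_b defines the same
    orientation as order_a, i.e. if order_b is an even permutation of
    order_a. The parity is found by counting inversions."""
    positions = [order_a.index(vertex) for vertex in order_b]
    inversions = sum(
        1
        for i in range(len(positions))
        for j in range(i + 1, len(positions))
        if positions[i] > positions[j]
    )
    return inversions % 2 == 0


abcd_vs_dbac = same_orientation("ABCD", "DBAC")
abcd_vs_cbad = same_orientation("ABCD", "CBAD")
```

For the tetrahedron of the example, $DBAC$ has four inversions with
respect to $ABCD$ (the same orientation) and $CBAD$ has three (the
opposite orientation).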

If we take an arbitrary small $p$-dimensional tetrahedron on a
$p$-dimensional manifold $(S)$, choose a certain orientation for it
and then make it run over the manifold, the original orientation
of the tetrahedron will induce the orientation of all small
$p$-dimensional tetrahedrons on $(S)$. Then we say that $(S)$ has been
oriented. For $p = 1$ a manifold of the type $(S)$ is a curve, and the
above method of orientation is equivalent to specifying a certain
direction on it. For $p = 2$ such a manifold $(S)$ is a two-dimensional
surface, and its orientation is equivalent to specifying the direction
of describing the contour of any small region on $(S)$. If $(S)$
consists of several disjoint portions they can be oriented
independently.

It should be taken into account that in the case $p \ge 2$ some
manifolds cannot be oriented. The so-called Möbius strip (see Fig. 322)
discovered in 1858 by the German geometer A. F. Möbius (1790-1868)
is the simplest example of a non-orientable two-dimensional
surface.

A $p$-fold multiple integral with respect to coordinates taken over
an oriented $p$-dimensional manifold $(S)$ lying in $E_k$ is defined as

$$\int \cdots \int_{(S)} u(t_1, \ldots, t_k)\, dt_{m_1}\, dt_{m_2} \ldots dt_{m_p} = \lim \sum_{h=1}^{n} u(M_h)\, \Delta S'_h \tag{53}$$

where the summation on the right-hand side is extended over all
small tetrahedrons $(\Delta S_h)$ into which $(S)$ is divided, the
orientation of the tetrahedrons being coherent with the orientation of
$(S)$. The symbol $\Delta S'_h$ ($h = 1, \ldots, n$) designates the
$p$-dimensional volume of the projection $(\Delta S'_h)$ of the
tetrahedron $(\Delta S_h)$ on the hyperplane with the coordinates
$t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ taken with the sign $+$ or $-$
depending on whether the orientation of the projection $(\Delta S'_h)$
(which is also a tetrahedron) coincides with the orientation of the
tetrahedron $O C_{m_1} C_{m_2} \ldots C_{m_p}$ where $C_j$
($j = 1, \ldots, k$) is the point lying on the $t_j$-axis whose distance
from the origin of coordinates is equal to unity. All the indices
$m_1, m_2, \ldots, m_p$ are supposed to be different; otherwise
integral (53) is considered, by definition, to be equal to zero.

The properties of integral (53) are analogous to the properties
of the integrals investigated in Sec. 3 with the exception of those
related to the integration of inequalities. If the orientation of $(S)$
is changed or if any two differentials under the integral sign are
permuted the integral is multiplied by $-1$ (why?). We also consider
the sums of integrals of the form

$$\int \cdots \int_{(S)} \sum_{m_1, \ldots, m_p = 1}^{k} u_{m_1 m_2 \ldots m_p}(t_1, \ldots, t_k)\, dt_{m_1}\, dt_{m_2} \ldots dt_{m_p} \tag{54}$$

An integral taken over an ordinary two-dimensional oriented
surface lying in the ordinary three-dimensional space with the
coordinates $x, y, z$ which can be written in the form

$$\iint_{(S)} P(x, y, z)\, dx\, dy + Q(x, y, z)\, dy\, dz + R(x, y, z)\, dx\, dz$$

is an example of integral (54).

When setting up the limits of integration in integral (53) we can
express all the coordinates $t_j$ different from
$t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ as functions of
$t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ for the points of the manifold $(S)$.
Then we substitute these expressions into the integrand which thus
becomes a function of $t_{m_1}, t_{m_2}, \ldots, t_{m_p}$, divide $(S)$
into parts whose projections on the hyperplane with the coordinates
$t_{m_1}, t_{m_2}, \ldots, t_{m_p}$ are of the same orientation and then
set up the limits of integration in the integrals over each projection.
The latter integrals are evaluated as ordinary $p$-fold multiple
integrals taken over a $p$-dimensional domain. We can also introduce
some convenient curvilinear coordinates $s_1, \ldots, s_p$ on $(S)$ and
then pass from the differentials $dt_{m_1}, \ldots, dt_{m_p}$ to the
differentials $ds_1, \ldots, ds_p$. To do this we substitute the
expression

$$\frac{D(t_{m_1}, \ldots, t_{m_p})}{D(s_1, \ldots, s_p)}\, ds_1 \ldots ds_p$$

for $dt_{m_1} \ldots dt_{m_p}$ under the sign of integration and set up
the corresponding limits of integration.


§ 6. Vector Field

Multiple integrals are directly applied in the theory of vector
fields. Here we shall discuss some of the applications. The reader
should recall the definition of a field given in Sec. IX. 9 before
proceeding to study the subject.

21. Vector Lines. We say that there is a vector field A (field 
of vector A) defined in space if the value of the vector quantity A 
is specified at each point M of space, i.e. A = A (M). We shall deal 
with a stationary field which does not change as time passes. If 
such a variation takes place we shall consider the field at a fixed 
moment of time and thus reduce our considerations to a stationary 
field. As examples of vector fields, we can consider the field of
velocity v, the field of momentum density ρv (where ρ is the density
of mass distribution) for a flow of a liquid or gas, the field of force F,
the electric field E (where E is the electric field strength) etc.

A curve (L) which is tangent to the vector A at each point is 
called a vector line. In other words, this is a curve whose direction 
(i.e. the direction of its tangent) coincides with the direction of the 
field at each point belonging to the curve. Depending on the physical
meaning of the field in question we speak about a stream line
(flow line) of a field of velocity, a line of force of a field of force and
so on. (Let the reader think why the stream lines coincide with the 
trajectories of the particles of liquid only in the case of a stationary 
field.) 

From the geometrical point of view the problem of constructing 
vector lines of a given vector field is equivalent to that of construc- 
ting integral curves for a given direction field (see Sec. XV.12). The 
problem is therefore reduced to integrating the corresponding sys- 
tem of differential equations. For this purpose it is necessary to 
introduce a coordinate system in space. For instance, if we take 
Cartesian coordinates x, y, z the vector A can be resolved according 
to the formula 

$$\mathbf{A} = \mathbf{A}(x, y, z) = A_x(x, y, z)\,\mathbf{i} + A_y(x, y, z)\,\mathbf{j} + A_z(x, y, z)\,\mathbf{k} \qquad (55)$$

On the basis of Sec. XV.12, we can put down the symmetric system 
of differential equations for the vector lines of the field A 

$$\frac{dx}{A_x(x, y, z)} = \frac{dy}{A_y(x, y, z)} = \frac{dz}{A_z(x, y, z)}$$

[compare this with equation (XV. 66)]. In the case of a plane
field (see Sec. IX. 9) the system turns into the equation

$$\frac{dx}{A_x(x, y)} = \frac{dy}{A_y(x, y)}$$

As was shown in Chapter XV, where the theory of differential
equations was studied, there is only one vector line passing through 
a non-singular point. Thus, the whole region in which a vector field 
is defined is filled with vector lines of the field. In a sufficiently 
small domain containing a non-singular 
point the totality of the vector lines re- 
sembles the set of parallel segments which 
can be curved a little. In the vicinity of 
a singular point the family of vector lines 
can have a very complicated structure 
(see Fig. 290 which represents some examp- 
les of this kind). 
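
The construction of a vector line can also be carried out numerically. The following is only a minimal sketch under stated assumptions: the plane field A = -y i + x j, the step size and the helper names `field`, `direction` and `trace_vector_line` are illustrative choices, not taken from the text. For this field the equation dx/(-y) = dy/x gives x dx + y dy = 0, so the vector lines are circles centred at the origin, and the traced points should keep a constant distance from it. The origin is a singular point of this field, where the sketch would divide by zero.

```python
import math

def field(x, y):
    """Assumed example of a plane field: A = -y i + x j."""
    return -y, x

def direction(x, y):
    """Unit vector along A; a vector line is tangent to it at each point."""
    ax, ay = field(x, y)
    norm = math.hypot(ax, ay)  # zero only at the singular point (0, 0)
    return ax / norm, ay / norm

def trace_vector_line(x, y, step=1e-3, n_steps=2000):
    """Follow the field direction with the classical Runge-Kutta scheme."""
    points = [(x, y)]
    for _ in range(n_steps):
        k1 = direction(x, y)
        k2 = direction(x + 0.5 * step * k1[0], y + 0.5 * step * k1[1])
        k3 = direction(x + 0.5 * step * k2[0], y + 0.5 * step * k2[1])
        k4 = direction(x + step * k3[0], y + step * k3[1])
        x += step * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += step * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        points.append((x, y))
    return points

# Vector line through the non-singular point (1, 0).
line = trace_vector_line(1.0, 0.0)
radii = [math.hypot(px, py) for px, py in line]
```

Since only the direction of A matters for a vector line, the field is normalized before stepping; the parameter along the line is then simply the arc length.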

Fig. 323

22. The Flux of a Vector Through a Surface. Let a vector field be
defined in a domain of space and let an oriented surface $(\sigma)$ (which
can be closed or non-closed) lie in the domain. We remind the reader
that orienting a surface is equivalent to indicating its outer and
inner sides (see Sec. VII. 11). The flux of a vector field A through
a surface $(\sigma)$ is the surface integral

$$Q = \int_{(\sigma)} A_n\, d\sigma$$

where $A_n$ is the projection of the vector A on the unit outer normal
n to $(\sigma)$. Using the notion of the vector of an area (see Sec. VII. 11)
and the properties of a scalar product of vectors (see Sec. VII. 2) we
can rewrite the expression of the flux in the form

$$Q = \int_{(\sigma)} A \cos\alpha\, d\sigma = \int_{(\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma}$$

(see Fig. 323).

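If the vector field A is represented in form (55) we can apply
the transformation

$$\mathbf{A}\cdot d\boldsymbol{\sigma} = \mathbf{A}\cdot\mathbf{n}\, d\sigma = (A_x\mathbf{i} + A_y\mathbf{j} + A_z\mathbf{k})\cdot(\cos(\mathbf{n}, x)\,\mathbf{i} + \cos(\mathbf{n}, y)\,\mathbf{j} + \cos(\mathbf{n}, z)\,\mathbf{k})\, d\sigma = A_x \cos(\mathbf{n}, x)\, d\sigma + A_y \cos(\mathbf{n}, y)\, d\sigma + A_z \cos(\mathbf{n}, z)\, d\sigma$$

to calculating the flux. Thus, the integral $\int_{(\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma}$ is represented
as the sum of three integrals. The first integral $\int_{(\sigma)} A_x \cos(\mathbf{n}, x)\, d\sigma$
can be computed if we use the relation $\cos(\mathbf{n}, x)\, d\sigma = \pm d\sigma_x$ where
$d\sigma_x$ entering into the right-hand side is the surface element of the
projection $(\sigma_x)$ of the surface $(\sigma)$ on the y, z-plane and the sign is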

buted in space (recall the general concept of a quantity distributed
in space which was discussed in § 2). We first turn to the latter case.
Here we can introduce not only the average density $Q/\Omega$ of the source
in $(\Omega)$ [as before, the symbol $\Omega$ designates the volume of the domain
$(\Omega)$] but also the density of the sources of the field at any point M
of space which is defined as

$$\lim_{(\Delta\Omega)\to M} \frac{\Delta Q}{\Delta\Omega} = \lim_{(\Delta\Omega)\to M} \frac{\oint_{(\Delta\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma}}{\Delta\Omega} \qquad (56)$$

where $(\Delta\Omega)$ is a small volume enveloping the point M and $(\Delta\sigma)$
is the surface which bounds $(\Delta\Omega)$.

This density is called the divergence of the vector field A and is 
designated as div A. Hence, we can say that the divergence of a 
vector field can be interpreted as the number of vector lines genera- 
ted in the interior of an infinitesimal volume (i.e. the flux of the 
field through the surface bounding the volume) related to unit 
volume. It should be noted that the divergence of a vector field 
is a scalar quantity. Moreover, it forms a scalar field because it 
assumes a certain numerical value at each point in space. 

Formula (56) can be rewritten in the form

$$\operatorname{div}\mathbf{A} = \frac{dQ}{d\Omega}, \quad \text{i.e.} \quad dQ = \operatorname{div}\mathbf{A}\, d\Omega$$

The last expression represents the number of vector lines issued
from the element of volume $(d\Omega)$. Summing together these
expressions over a domain $(\Omega)$ (see Sec. 4) we arrive at the formula for
the number of vector lines coming out of the finite volume $(\Omega)$ (that
is for the flux of the field A):

$$\oint_{(\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} = \int_{(\Omega)} \operatorname{div}\mathbf{A}\, d\Omega \qquad (57)$$

where $(\Omega)$ is any finite domain and $(\sigma)$ is its boundary surface. This
is Ostrogradsky’s formula which plays an important role in vector
field theory. It was discovered by Ostrogradsky (in scalar form)
in 1826. The formula holds in all cases when the field A and its
divergence div A do not approach infinity in $(\Omega)$. It is also valid
when the divergence approaches infinity in such a way that the
integral on the right-hand side of formula (57) is convergent.
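
Ostrogradsky's formula is easy to test numerically on a simple domain. The sketch below is an illustration under stated assumptions, not part of the original course: the field A = x y² i + y z² j + z x² k, the unit cube 0 ≤ x, y, z ≤ 1 and the midpoint rule are all chosen here for convenience. For this field div A = x² + y² + z², and both sides of the formula equal 1.

```python
def field(x, y, z):
    """Assumed sample field A = (x*y^2, y*z^2, z*x^2)."""
    return x * y**2, y * z**2, z * x**2

def divergence(x, y, z):
    """div A for the sample field, computed by hand."""
    return y**2 + z**2 + x**2

def flux_through_cube(n=50):
    """Flux of A through the boundary of the unit cube (midpoint rule)."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for u in mids:
        for v in mids:
            # Faces x = 1 and x = 0: outer normals +i and -i.
            total += (field(1.0, u, v)[0] - field(0.0, u, v)[0]) * h * h
            # Faces y = 1 and y = 0: outer normals +j and -j.
            total += (field(u, 1.0, v)[1] - field(u, 0.0, v)[1]) * h * h
            # Faces z = 1 and z = 0: outer normals +k and -k.
            total += (field(u, v, 1.0)[2] - field(u, v, 0.0)[2]) * h * h
    return total

def divergence_integral(n=50):
    """Triple integral of div A over the unit cube (midpoint rule)."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    return sum(divergence(x, y, z)
               for x in mids for y in mids for z in mids) * h**3

flux = flux_through_cube()
volume_integral = divergence_integral()
```

The two numbers agree up to the discretization error of the midpoint rule, which decreases as `n` grows.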

The physical meaning of the divergence of a field depends on the
nature of the vector field A. For instance, by Sec. 22, for the velocity
field v of a gas flow div v is equal to the rate of relative expansion
of an infinitesimal volume of gas and div (ρv) is equal to the
density of mass sources. If the mass of the gas remains constant in the
process of its flow we must have div (ρv) = 0 (in the general case
the mass can receive an increment, positive or negative, resulting
from a chemical or some other reaction in which the mass can
change). At the same time we can have div v > 0, div v < 0 or
div v = 0 depending on whether the gas expands, contracts or does not
change its density in the process of flow. If we take an electric field
E its divergence, i.e. div E, is proportional to the density of a charge
distributed in space and so on.

If a field has sources distributed over curves or surfaces (its
volume density must be discontinuous in such a case) we can speak about
the line density or the surface density. In such a case we must add
to the right-hand side of formula (57) the corresponding line and
surface integrals taken over the curves and the surfaces carrying
the sources which lie in the domain $(\Omega)$. If there are point charges
in $(\Omega)$ the corresponding summands should also be added to the
right-hand side of (57). If we understand the densities as generalized
functions which were discussed in Sec. XIV. 7 and Sec. 19,
formula (57) will be true in all cases.

If we take a plane vector field A the divergence is defined as

$$(\operatorname{div}\mathbf{A})_M = \lim_{(\Delta\sigma)\to M} \frac{\oint_{(\Delta l)} A_n\, dl}{\Delta\sigma} \qquad (58)$$

[formula (58) replaces formula (56) in this case]. Here $(\Delta\sigma)$ is a
small plane figure enveloping the point M and $(\Delta l)$ is the contour
bounding the figure. As is known (see Sec. IX. 9), a plane field can
be interpreted in two ways. If the field is defined only in a given
plane then, by definition, the numerator on the right-hand side
of formula (58) is considered to be the flux of the vector field A
across the curve $(\Delta l)$. But if the field A is originally defined in space
and is regarded as a plane field because it does not depend on one
of the Cartesian coordinates (for instance, on z) the numerator is
equal to the flux of the vector field A through the lateral surface
of a right cylinder [with base $(\Delta\sigma)$ and unit height] whose elements
are parallel to the z-axis. In this case the denominator on the
right-hand side of formula (58) is equal to the volume of the cylinder
(why is it so?).

Ostrogradsky’s formula for a plane field has the form

$$\oint_{(l)} A_n\, dl = \int_{(\sigma)} \operatorname{div}\mathbf{A}\, d\sigma$$

where $(\sigma)$ is a finite plane figure and (l) is its contour.

The divergence of a field can sometimes be directly computed
on the basis of its definition (56). For example, let us consider a
centrally symmetric field in space which is defined by the formula

$$\mathbf{A} = f(r)\,\mathbf{r}^0 = \frac{f(r)}{r}\,\mathbf{r}$$

where r is the radius-vector of a variable point and f(r) is a given
function of the modulus of r (see Fig. 324). Then the flux across the
sphere of radius r is equal to

$$Q(r) = \int A_n\, d\sigma = \int A_r\, d\sigma = \int f(r)\, d\sigma = f(r)\, 4\pi r^2$$

and therefore the number of vector lines issued from a thin spherical
layer of width dr is equal to

$$dQ = 4\pi\, d[r^2 f(r)] = 4\pi\, [2r f(r) + r^2 f'(r)]\, dr$$

Consequently, we have

$$\operatorname{div}\mathbf{A} = \frac{2}{r} f(r) + f'(r)$$

which is obtained from the above expression for dQ after it has been
divided by the volume $d\Omega = 4\pi r^2\, dr$ of the layer.
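
As a quick check of this result, we can divide $dQ = 4\pi [2r f(r) + r^2 f'(r)]\, dr$ by $d\Omega = 4\pi r^2\, dr$ for two particular functions f(r). Both choices are illustrative and are not taken from the text.

$$f(r) = r: \quad \mathbf{A} = \mathbf{r}, \qquad \operatorname{div}\mathbf{A} = \frac{2}{r}\, r + 1 = 3$$

$$f(r) = \frac{1}{r^2}: \quad \operatorname{div}\mathbf{A} = \frac{2}{r}\cdot\frac{1}{r^2} - \frac{2}{r^3} = 0 \qquad (r \neq 0)$$

In the second case the flux $Q(r) = f(r)\, 4\pi r^2 = 4\pi$ does not depend on r, so no vector lines are generated in any spherical layer that does not contain the origin, in agreement with the zero divergence.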

Fig. 324

Fig. 325

24. Expressing Divergence in Cartesian Coordinates. Let a
Cartesian coordinate system x, y, z be given in space. Then a vector
field A can be represented in form (55). In this case we can deduce
a simple formula for computing the divergence div A. To do this we
take into account the fact that the particular form of an
infinitesimal domain $(\Delta\Omega)$ entering into the definition
of a divergence [see formula (56)] is inessential. Therefore we can
take a small rectangular parallelepiped with faces parallel to the
coordinate planes as this volume (see Fig. 325). Then the numerator
of the fraction in expression (56) can be represented as a sum
of six summands corresponding to the six faces of the parallelepiped.
We now consider the sum of two summands corresponding to the
faces (designated by I and II in Fig. 325) which are parallel to
the y, z-plane and whose unit outer normals are denoted as $\mathbf{n}_{\mathrm{I}}$
and $\mathbf{n}_{\mathrm{II}}$. We have $(A_n)_{\mathrm{I}} = -(A_x)_{\mathrm{I}}$, and on the basis of Taylor’s
formula (see Sec. IV.15) we can write

$$(A_n)_{\mathrm{II}} = (A_x)_{\mathrm{II}} = (A_x)_{\mathrm{I}} + (\partial_x A_x)_{\mathrm{I}} + \dots$$

where the expression $\partial_x A_x = \frac{\partial A_x}{\partial x}\, \Delta x$ is the partial differential
with respect to x which appears here because the points belonging
to the faces I and II differ by $\Delta x$ in the values of their abscissas x.
The dots on the right-hand side of the formula designate the terms
of higher order of smallness which are not put down here. The
integration over these faces reducing to the integration over their
projections onto the y, z-plane [i.e. over $(\Delta\Omega)_x$], we have

$$\int_{(\mathrm{I})} A_n\, d\sigma + \int_{(\mathrm{II})} A_n\, d\sigma = \iint_{(\Delta\Omega)_x} (A_n)_{\mathrm{I}}\, dy\, dz + \iint_{(\Delta\Omega)_x} (A_n)_{\mathrm{II}}\, dy\, dz = \iint_{(\Delta\Omega)_x} \left(\frac{\partial A_x}{\partial x}\right)_{\mathrm{I}} \Delta x\, dy\, dz + \dots = \left(\frac{\partial A_x}{\partial x}\right)_{\mathrm{av}} \Delta x\, \Delta y\, \Delta z + \dots = \left(\frac{\partial A_x}{\partial x}\right)_M \Delta x\, \Delta y\, \Delta z + \dots$$

The dots here also designate the terms of higher order of smallness
relative to the terms which are written in full. The subscripts I,
II and M mean that the corresponding terms are taken for the point
belonging to the face I, II or for the point M, respectively. The
subscript av means the average (mean) value. When performing
the last transformation in the above formula we have used the
formula

$$\left(\frac{\partial A_x}{\partial x}\right)_{\mathrm{av}} = \left(\frac{\partial A_x}{\partial x}\right)_M + \text{infinitesimal}$$

and in the preceding transformation we have taken advantage of
the formula for the mean value of a function (property 10 in
Sec. XVI. 3).

Performing similar calculations for the other two pairs of faces
and summing up all the expressions we arrive at the formula for
the flux through the whole boundary surface of the parallelepiped:

$$\oint_{(\Delta\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} = \left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right)_M \Delta x\, \Delta y\, \Delta z + \dots$$

In this case we have $\Delta\Omega = \Delta x\, \Delta y\, \Delta z$ and therefore

$$\frac{1}{\Delta\Omega} \oint_{(\Delta\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} = \left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right)_M + \dots$$

In the limit we finally obtain the expression

$$\operatorname{div}\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \qquad (59)$$

We have not written the subscript M here because the formula holds
for any point of the field.

We can similarly find the expression of a divergence for a plane
field. The result must obviously be the same as (59) with the
exception that the last term entering into (59) should be deleted.
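
The derivation above can be imitated numerically: we take a small cube centred at a point M, estimate the flux through each pair of opposite faces by the values of the field at the face centres and divide by the volume. This is only a sketch under stated assumptions: the sample field A = x²y i + yz j + xz² k, the point M = (1, 2, 3) and the edge length `h` are arbitrary choices made here. By formula (59) the divergence of this field is 2xy + z + 2xz, which equals 13 at M.

```python
def field(x, y, z):
    """Assumed sample field A = (x^2*y, y*z, x*z^2)."""
    return x * x * y, y * z, x * z * z

def flux_density(vector_field, point, h=1e-3):
    """Flux through a small cube centred at `point`, divided by its volume.

    Each face integral is replaced by (value at the face centre) * (area),
    which corresponds to dropping the terms of higher order of smallness.
    """
    x0, y0, z0 = point
    area = h * h
    flux = 0.0
    # Pair of faces parallel to the y, z-plane (normals -i and +i).
    flux += (vector_field(x0 + h / 2, y0, z0)[0]
             - vector_field(x0 - h / 2, y0, z0)[0]) * area
    # Pair of faces parallel to the z, x-plane (normals -j and +j).
    flux += (vector_field(x0, y0 + h / 2, z0)[1]
             - vector_field(x0, y0 - h / 2, z0)[1]) * area
    # Pair of faces parallel to the x, y-plane (normals -k and +k).
    flux += (vector_field(x0, y0, z0 + h / 2)[2]
             - vector_field(x0, y0, z0 - h / 2)[2]) * area
    return flux / h**3

estimate = flux_density(field, (1.0, 2.0, 3.0))
exact = 2 * 1.0 * 2.0 + 3.0 + 2 * 1.0 * 3.0  # formula (59) at M
```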

25. Line Integral and Circulation. Let an oriented curve (L)
(i.e. such that the direction of describing this curve is indicated)
be given in the domain of space where a vector field A is defined.
Then we can form the line integral

$$I = \int_{(L)} A_\tau\, dL \qquad (60)$$

taken along (L) where $A_\tau$ is the projection of the vector A on the
unit vector $\boldsymbol{\tau}$ tangent to (L) drawn in the direction of describing
the curve (L) (see Fig. 326). Since the vector dr goes along $\boldsymbol{\tau}$
and |dr| = dL (see Sec. VII.23) the expression for line integral (60)
can be rewritten in the form

$$I = \int_{(L)} A \cos\alpha\, |d\mathbf{r}| = \int_{(L)} \mathbf{A}\cdot d\mathbf{r} = \int_{(L)} (A_x\, dx + A_y\, dy + A_z\, dz) \qquad (61)$$

Line integral (60), or (61), is a scalar quantity possessing all the
properties of line integrals (see Sec. XIV. 6). If the orientation of
the curve (L) is changed the integral is multiplied by -1. If the
angle α shown in Fig. 326 is acute at all the points of the curve (L)
we have I > 0 and if the angle is obtuse I < 0. Finally, I = 0
if α is a right angle at all points. We can also have I = 0 when
the angle α is acute for one portion of the curve (L) and is obtuse
for the other but varies in such a way that the integrals taken over
these portions mutually cancel out.

A line integral of type (60) has an obvious physical meaning when 
A is a field of force. As we showed in Sec. XIV. 22, in this case the 
integral is equal to the work performed by the field when the point 
upon which the force acts describes the curve ( L ). 

If (L) is a closed curve line integral (60) is called the circulation
[in this case we can write the integral as $\oint_{(L)} (A_x\, dx + A_y\, dy + A_z\, dz)$].
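
A line integral of form (61) can be approximated by replacing the curve with an inscribed polygon and taking the value of the field at the midpoint of each chord. The sketch below rests on assumptions made only for illustration: the helper `line_integral`, the two sample fields and the unit circle in the x, y-plane are not taken from the text. For the field -y i + x j the circulation over the unit circle equals 2π, whereas for the field x i + y j + z k, which is the gradient of (x² + y² + z²)/2, the circulation over any closed curve is zero.

```python
import math

def line_integral(vector_field, curve, t0, t1, n=2000):
    """Approximate the integral of A . dr along r = curve(t), t0 <= t <= t1.

    The curve is replaced by a polygon with n chords; on each chord the
    field is evaluated at the midpoint of the chord.
    """
    total = 0.0
    prev = curve(t0)
    for i in range(1, n + 1):
        cur = curve(t0 + (t1 - t0) * i / n)
        mid = tuple((p + c) / 2 for p, c in zip(prev, cur))
        a = vector_field(*mid)
        total += sum(ai * (c - p) for ai, c, p in zip(a, cur, prev))
        prev = cur
    return total

def rotational(x, y, z):
    """Assumed sample field -y i + x j."""
    return (-y, x, 0.0)

def gradient_like(x, y, z):
    """Assumed sample field x i + y j + z k = grad((x^2 + y^2 + z^2) / 2)."""
    return (x, y, z)

def circle(t):
    """Unit circle in the x, y-plane described counterclockwise."""
    return (math.cos(t), math.sin(t), 0.0)

circulation = line_integral(rotational, circle, 0.0, 2 * math.pi)
work_closed = line_integral(gradient_like, circle, 0.0, 2 * math.pi)
```

Reversing the direction of describing the curve (for instance, integrating from 2π to 0) changes the sign of both results, as stated above.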

26. Rotation. For our further aims we need the expression of
a circulation taken along an infinitesimal closed contour $(\Delta L)$. To
obtain the expression we assume that the vector field A is
represented in form (55), i.e. as being resolved into components along
the unit vectors of the Cartesian axes. Let the contour $(\Delta L)$ be placed
near a point $M_0$ of space. Now we compute the integral of the first
summand entering into the right-hand side of formula (61):

$$\oint_{(\Delta L)} A_x(x, y, z)\, dx = \oint_{(\Delta L)} \left[ (A_x)_0 + \left(\frac{\partial A_x}{\partial x}\right)_0 (x - x_0) + \left(\frac{\partial A_x}{\partial y}\right)_0 (y - y_0) + \left(\frac{\partial A_x}{\partial z}\right)_0 (z - z_0) + \dots \right] dx \qquad (62)$$

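Here we have applied Taylor’s formula (see Sec. XII. 6). The
subscript “zero” indicates that the corresponding quantities are taken
at the point $M_0$, and the dots designate the terms of higher order
of smallness. Recall that in Sec. XIV.23 we proved the formulas
which can be written in the notation of this section as

$$\oint_{(\Delta L)} [C_1 + C_2 x]\, dx = 0, \qquad \oint_{(\Delta L)} y\, dx = -\Delta S \cos(\mathbf{n}, z) \qquad \text{and} \qquad \oint_{(\Delta L)} z\, dx = \Delta S \cos(\mathbf{n}, y)$$

where $C_1$ and $C_2$ are arbitrary constants, $\Delta S$ is the area of the surface
$(\Delta S)$ bounded by the curve $(\Delta L)$ (this surface can be regarded as
a plane figure to within infinitesimals of higher order) and n is the
unit outer normal to $(\Delta S)$ whose direction is coherent with the
direction of describing $(\Delta L)$ according to the right-hand screw rule
(see Sec. VII. 11). Hence we can write the relation of the form

$$\oint_{(\Delta L)} \left[ (A_x)_0 + \left(\frac{\partial A_x}{\partial x}\right)_0 (x - x_0) - \left(\frac{\partial A_x}{\partial y}\right)_0 y_0 - \left(\frac{\partial A_x}{\partial z}\right)_0 z_0 \right] dx = \oint_{(\Delta L)} [C_1 + C_2 x]\, dx = 0$$

Substituting all these results into (62) we obtain

$$\oint_{(\Delta L)} A_x\, dx = \left[ \left(\frac{\partial A_x}{\partial z}\right)_0 \cos(\mathbf{n}, y) - \left(\frac{\partial A_x}{\partial y}\right)_0 \cos(\mathbf{n}, z) \right] \Delta S + \dots \qquad (63)$$

Evaluating the other two integrals on the right-hand side of
formula (61) in a similar manner [to perform the evaluation it is
sufficient to make two successive circular permutations of the
coordinates in formula (63)] and summing up the results we deduce

$$\oint_{(\Delta L)} \mathbf{A}\cdot d\mathbf{r} = \{ \left[ \left(\frac{\partial A_x}{\partial z}\right)_0 \cos(\mathbf{n}, y) - \left(\frac{\partial A_x}{\partial y}\right)_0 \cos(\mathbf{n}, z) \right] + \left[ \left(\frac{\partial A_y}{\partial x}\right)_0 \cos(\mathbf{n}, z) - \left(\frac{\partial A_y}{\partial z}\right)_0 \cos(\mathbf{n}, x) \right] +$$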





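The rotation of a plane field has a particularly simple expression.
Indeed, if $\mathbf{A} = A_x(x, y)\,\mathbf{i} + A_y(x, y)\,\mathbf{j}$ formula (65) implies that
in this case we have

$$\operatorname{rot}\mathbf{A} = \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\mathbf{k}$$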



Fig. 328 


27. Green’s Formula. Stokes’ Formula. These formulas enable
us to transform the circulation of a vector field over a closed contour
to a double integral over a surface bounded by the contour.
Green’s formula is related to a plane field and Stokes’ formula
deals with the general case of a spatial field. The former formula
directly follows from the latter but we shall deduce Green’s formula
independently because the deduction is very simple.

Let us consider the circulation of a plane field
$\mathbf{A} = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ over a closed contour (L) which is
described in the positive direction. The finite domain bounded by the
contour (L) will be designated by (S) (see Fig. 328). By formula (61),
the circulation can be written as

$$\Gamma = \oint_{(L)} P(x, y)\, dx + \oint_{(L)} Q(x, y)\, dy \qquad (68)$$

For the first integral in (68) we obtain

$$\int_a^b P(x, y_1)\, dx + \int_b^a P(x, y_2)\, dx = -\int_a^b [P(x, y_2) - P(x, y_1)]\, dx \qquad (69)$$

(see Fig. 328). The expression under the sign of integration is a
partial increment of P with respect to y which can be represented
in the form of an integral of the derivative:

$$P(x, y_2) - P(x, y_1) = \int_{\varphi_1(x)}^{\varphi_2(x)} \frac{\partial P}{\partial y}\, dy$$

Substituting this expression into (69) we obtain

$$-\int_a^b \left( \int_{\varphi_1(x)}^{\varphi_2(x)} \frac{\partial P}{\partial y}\, dy \right) dx = -\iint_{(S)} \frac{\partial P}{\partial y}\, dx\, dy$$

The second integral in (68) is transformed similarly (we leave the
calculations to the reader). Adding together the results we arrive
at Green's formula:

$$\oint_{(L)} (P\, dx + Q\, dy) = \iint_{(S)} \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dS \qquad (70)$$

The formula is applicable if all the functions P, Q, $\frac{\partial Q}{\partial x}$ and $\frac{\partial P}{\partial y}$
are finite everywhere in (S). In particular, formula (70) implies an
assertion mentioned in Sec. XV. 6 which states that if the condition
$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ holds in a simply-connected domain (G) in the plane
the expression P dx + Q dy is a total differential in the domain.
Actually, by formula (70), we have $\oint_{(L)} (P\, dx + Q\, dy) = 0$ for any
closed contour (L) lying in (G) and hence the assertion follows from
Sec. XIV.24. The condition that the domain in question should be
simply-connected implies that for any contour (L) belonging to (G)
the portion of the plane lying inside (L) also belongs to (G) (which
may not hold for a multiply-connected domain), the implication
being applied to the deduction of the above assertion.
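
Green's formula lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: the functions P = -x²y and Q = xy², the unit disk as the domain (S) and the midpoint rule in polar coordinates are chosen here and do not come from the text. For this field ∂Q/∂x - ∂P/∂y = x² + y², and both sides of Green's formula equal π/2.

```python
import math

def P(x, y):
    """Assumed sample function P = -x^2 * y."""
    return -x * x * y

def Q(x, y):
    """Assumed sample function Q = x * y^2."""
    return x * y * y

def dQ_dx(x, y):
    return y * y

def dP_dy(x, y):
    return -x * x

def circulation(n=2000):
    """Integral of P dx + Q dy over the unit circle (positive direction)."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt
        total += P(x, y) * dx + Q(x, y) * dy
    return total

def double_integral(nr=200, nt=200):
    """Integral of dQ/dx - dP/dy over the unit disk in polar coordinates."""
    dr, dt = 1.0 / nr, 2 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            t = (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            # r * dr * dt is the element of area in polar coordinates.
            total += (dQ_dx(x, y) - dP_dy(x, y)) * r * dr * dt
    return total

left_side = circulation()
right_side = double_integral()
```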

We now proceed to deduce an analogous formula (Stokes’ formula)
for a spatial field.* The formula, discovered in 1854 by the English
physicist and mathematician G. G. Stokes (1819-1903), is widely
applied in the theory of vector fields. Let a finite oriented contour
(L) bounding a finite oriented surface (S) be given. Let the
orientations of (L) and (S) be coherent as shown in Fig. 329.
Divide (S) into small surfaces $(\Delta S_1), (\Delta S_2), \dots, (\Delta S_m)$ bounded
by the contours $(\Delta L_1), (\Delta L_2), \dots, (\Delta L_m)$. The contours $(\Delta L_i)$
(i = 1, ..., m) are considered to be oriented according to the
orientations of (L) and (S). Then we readily conclude that

$$\oint_{(L)} \mathbf{A}\cdot d\mathbf{r} = \sum_{i=1}^{m} \oint_{(\Delta L_i)} \mathbf{A}\cdot d\mathbf{r} \qquad (71)$$

because the integrals taken over the arcs entirely lying inside (L)
and which enter into the right-hand side of (71) mutually cancel
out (why?) and the sum of the remaining integrals just equals the
left-hand side of formula (71). Regarding $(\Delta S_i)$ (i = 1, ..., m)
as being infinitesimal we can apply formula (66) to each integral
$\oint_{(\Delta L_i)} \mathbf{A}\cdot d\mathbf{r}$ which results in

$$\oint_{(L)} \mathbf{A}\cdot d\mathbf{r} = \sum_{i=1}^{m} (\operatorname{rot}_n \mathbf{A})_i\, \Delta S_i + \dots$$

where the subscript i indicates that the corresponding value of
$\operatorname{rot}_n \mathbf{A}$ is taken at a point belonging to the ith area. The sum on the
right-hand side is an integral sum (see Sec. 2), and therefore passing
to the limit in the process when the linear sizes of all the subdomains
are decreased unlimitedly we obtain

$$\oint_{(L)} \mathbf{A}\cdot d\mathbf{r} = \int_{(S)} \operatorname{rot}_n \mathbf{A}\, dS = \int_{(S)} \operatorname{rot}\mathbf{A}\cdot d\mathbf{S} \qquad (72)$$

Thus, the circulation of a vector field over a closed contour is
equal to the flux of the rotation of the field through a surface
bounded by the contour. It is formula (72) that is called Stokes’ formula.

* Can be omitted for the first reading of the book. (Tr.)




The formula holds if the field A and its rotation are finite on the 
surface ( S ). It also holds if the rotation approaches infinity in such 
a way that the integral on the right-hand side of (72) converges. 
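
Stokes' formula can be checked numerically in the same spirit. In the sketch below everything specific is an assumption made for illustration: the field A = z i + x j + y k, whose rotation computed by hand is i + j + k, the upper unit hemisphere as the surface (S) and the unit circle in the x, y-plane as its contour (L), described counterclockwise when viewed from the positive z-axis so that the orientations are coherent. Both sides of the formula equal π.

```python
import math

def field(x, y, z):
    """Assumed sample field A = z i + x j + y k."""
    return (z, x, y)

def rot(x, y, z):
    """rot A for the sample field, computed by hand: i + j + k."""
    return (1.0, 1.0, 1.0)

def circulation(n=1000):
    """Integral of A . dr over the unit circle in the plane z = 0."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        point = (math.cos(t), math.sin(t), 0.0)
        dr = (-math.sin(t) * dt, math.cos(t) * dt, 0.0)
        a = field(*point)
        total += sum(ai * di for ai, di in zip(a, dr))
    return total

def rot_flux(n_theta=200, n_phi=200):
    """Flux of rot A through the upper unit hemisphere (outer normal)."""
    d_theta = (math.pi / 2) / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # On the unit sphere the outer normal coincides with the
            # radius-vector of the point.
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            d_s = math.sin(theta) * d_theta * d_phi
            r = rot(*normal)
            total += sum(ri * ni for ri, ni in zip(r, normal)) * d_s
    return total

left_side = circulation()
right_side = rot_flux()
```

Replacing the hemisphere by the flat unit disk with normal k gives the same flux, since the value of the right-hand side of the formula depends only on the contour.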

We note that a contour (L) entering into Stokes’ formula can 
consist of several portions (components). In this case the contours 
must be oriented in a corresponding way (see Fig. 330). An analogous 
remark relates to Ostrogradsky’s formula (Sec. 23). 

In particular, Stokes’ formula implies the sufficiency of conditions
(XIV. 99) (see Sec. XIV. 24 where the question was discussed) for
integral (XIV.93) to be independent of the path of integration.
Actually, let us introduce the field A = Pi + Qj + Rk for which
we have rot A = 0 if conditions (XIV.99) hold. Now taking an
arbitrary closed contour (L) and an arbitrary surface (S) spanned by
the contour we deduce, on the basis of Stokes’ formula, equality
(XIV.95). The domain (G) of space in which the whole construction
is performed is supposed to be simply-connected which guarantees
the existence of a surface spanned by the contour because if we
contract the contour (L) within the above domain (G) in space to
a point it describes a surface (S) of the desired type (see Fig. 331).

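28. Expressing Differential Operations on Vector Fields in a
Curvilinear Orthogonal Coordinate System. We now consider a
curvilinear orthogonal coordinate system λ, μ, ν in space. It is
natural to construct a system of unit vectors $\mathbf{e}_\lambda, \mathbf{e}_\mu, \mathbf{e}_\nu$ tangent to
the coordinate curves at each point of space and to resolve the vector
fields in question with respect to these vectors. Thus we obtain
a resolution of the form

$$\mathbf{A} = A_\lambda \mathbf{e}_\lambda + A_\mu \mathbf{e}_\mu + A_\nu \mathbf{e}_\nu$$

at each point.

To express the gradient of a scalar field u at an arbitrary point
M we should recall that when evaluating the gradient according
to formula (XII. 2) we can place the Cartesian coordinate system
in any way (see Sec. XII. 1). Thus, we can put $\mathbf{i} = \mathbf{e}_\lambda$, $\mathbf{j} = \mathbf{e}_\mu$
and $\mathbf{k} = \mathbf{e}_\nu$. This yields

$$\operatorname{grad} u = \frac{\partial_\lambda u}{ds_\lambda}\,\mathbf{e}_\lambda + \frac{\partial_\mu u}{ds_\mu}\,\mathbf{e}_\mu + \frac{\partial_\nu u}{ds_\nu}\,\mathbf{e}_\nu = \frac{1}{l_\lambda}\frac{\partial u}{\partial\lambda}\,\mathbf{e}_\lambda + \frac{1}{l_\mu}\frac{\partial u}{\partial\mu}\,\mathbf{e}_\mu + \frac{1}{l_\nu}\frac{\partial u}{\partial\nu}\,\mathbf{e}_\nu$$

where $l_\lambda$, $l_\mu$ and $l_\nu$ are Lamé’s coefficients (see Sec. 15).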

When calculating the divergence of a vector field we cannot
directly apply formula (59) to a curvilinear coordinate system. For
instance, if we put $\mathbf{i} = \mathbf{e}_\lambda$, etc. as above, the equality $A_x = A_\lambda$ will
hold only at the point M (why?) and hence the derivative $\frac{\partial A_x}{\partial x}$
cannot be found in such a simple way as above in the general case.
Here we can apply the method which was used at the beginning
of Sec. 24 in investigating the divergence. Let us consider the flux
of the field through the surface of an infinitesimal rectangular
parallelepiped bounded by the coordinate surfaces (see Fig. 332). Taking
the sum of fluxes across the two faces perpendicular to the
coordinate curve λ (on which λ varies whereas μ and ν are constant) we
obtain, to within infinitesimals of higher order, the expression

$$\partial_\lambda (A_\lambda\, ds_\mu\, ds_\nu) = \partial_\lambda (l_\mu l_\nu A_\lambda)\, d\mu\, d\nu = \frac{\partial (l_\mu l_\nu A_\lambda)}{\partial\lambda}\, d\lambda\, d\mu\, d\nu$$

Computing the fluxes through the other two pairs of faces, adding
together all the results and dividing by the element of volume
$d\Omega = l_\lambda l_\mu l_\nu\, d\lambda\, d\mu\, d\nu$ we obtain

$$\operatorname{div}\mathbf{A} = \frac{1}{l_\lambda l_\mu l_\nu} \left[ \frac{\partial (l_\mu l_\nu A_\lambda)}{\partial\lambda} + \frac{\partial (l_\nu l_\lambda A_\mu)}{\partial\mu} + \frac{\partial (l_\lambda l_\mu A_\nu)}{\partial\nu} \right]$$

To derive the expression for the rotation we must take advantage of
formula (67). The circulation of the vector A over the contour of

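an infinitesimal rectangle perpendicular to the vector $\mathbf{e}_\lambda$ (see
Fig. 333) is equal to

$$\oint \mathbf{A}\cdot d\mathbf{r} = \left( \int_{NP} - \int_{MQ} \right) - \left( \int_{QP} - \int_{MN} \right) = \partial_\mu (A_\nu\, ds_\nu) - \partial_\nu (A_\mu\, ds_\mu) = \partial_\mu (l_\nu A_\nu\, d\nu) - \partial_\nu (l_\mu A_\mu\, d\mu) = \left[ \frac{\partial (l_\nu A_\nu)}{\partial\mu} - \frac{\partial (l_\mu A_\mu)}{\partial\nu} \right] d\mu\, d\nu$$

to within infinitesimals of higher order of smallness. Dividing by
the surface element $dS = l_\mu l_\nu\, d\mu\, d\nu$ and performing circular
permutation of the indices we obtain the formulas

$$(\operatorname{rot}\mathbf{A})_\lambda = \frac{1}{l_\mu l_\nu} \left[ \frac{\partial (l_\nu A_\nu)}{\partial\mu} - \frac{\partial (l_\mu A_\mu)}{\partial\nu} \right],$$

$$(\operatorname{rot}\mathbf{A})_\mu = \frac{1}{l_\nu l_\lambda} \left[ \frac{\partial (l_\lambda A_\lambda)}{\partial\nu} - \frac{\partial (l_\nu A_\nu)}{\partial\lambda} \right],$$

$$(\operatorname{rot}\mathbf{A})_\nu = \frac{1}{l_\lambda l_\mu} \left[ \frac{\partial (l_\mu A_\mu)}{\partial\lambda} - \frac{\partial (l_\lambda A_\lambda)}{\partial\mu} \right]$$

All these formulas are naturally simplified in the case of a plane
field for which we must put $A_\nu = 0$ and $l_\nu = 1$ and regard all the
quantities involved as being independent of ν.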
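
As an illustration, let us take cylindrical coordinates λ = ρ, μ = φ, ν = z. We assume here the standard values of Lamé's coefficients for this system, $l_\rho = 1$, $l_\varphi = \rho$, $l_z = 1$; they are not derived in this section. Substituting them into the expressions for the divergence and for the component of the rotation along the z-axis we get

$$\operatorname{div}\mathbf{A} = \frac{1}{\rho} \left[ \frac{\partial (\rho A_\rho)}{\partial\rho} + \frac{\partial A_\varphi}{\partial\varphi} + \frac{\partial (\rho A_z)}{\partial z} \right] = \frac{1}{\rho}\frac{\partial (\rho A_\rho)}{\partial\rho} + \frac{1}{\rho}\frac{\partial A_\varphi}{\partial\varphi} + \frac{\partial A_z}{\partial z}$$

$$(\operatorname{rot}\mathbf{A})_z = \frac{1}{\rho} \left[ \frac{\partial (\rho A_\varphi)}{\partial\rho} - \frac{\partial A_\rho}{\partial\varphi} \right]$$

For instance, for the field $\mathbf{A} = \rho\,\mathbf{e}_\rho$ (that is, A = x i + y j in Cartesian coordinates) the first formula gives $\operatorname{div}\mathbf{A} = \frac{1}{\rho}\frac{\partial (\rho^2)}{\partial\rho} = 2$, which agrees with the value 1 + 1 = 2 obtained in Cartesian coordinates.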

29. General Formula for Transforming Integrals. It turns out
that the formulas of Stokes, Ostrogradsky and some formulas in
a multiple-dimensional space analogous to them can be written
in the form of a single formula which generalizes them all. Let us
take the k-dimensional space $E_k$ in which the Lebesgue measure
is introduced (see Sec. 20). Consider an oriented (p + 1)-dimensional
(p = 1, 2, ..., k - 1) manifold $(\Omega)$ with the p-dimensional
boundary $(\Omega')$ lying in $E_k$. The orientation of $(\Omega)$ induces the
corresponding orientation of $(\Omega')$ according to the following rule:
if a small (p + 1)-dimensional tetrahedron $A_1 A_2 A_3 \dots A_{p+1} A_{p+2}$
which belongs to $(\Omega)$ and whose vertices are enumerated in accord
with the orientation of $(\Omega)$ is placed so that its face $A_1 A_2 \dots A_{p+1}$
belongs to $(\Omega')$ this order of the vertices must correspond
to the orientation of $(\Omega')$. [Let the reader check that if $(\Omega)$ is a surface
in the three-dimensional geometric space the above rule coincides
with the ordinary rule of the coherence between the orientation
of a surface and its contour.]

We now consider integrals of form (54) where $(S) = (\Omega')$. The
element of integration

$$\omega = \sum_{m_1, m_2, \dots, m_p = 1}^{k} a_{m_1, m_2, \dots, m_p}(t_1, \dots, t_k)\, dt_{m_1} \dots dt_{m_p} \qquad (73)$$

is a homogeneous function of degree p with respect to the
differentials $dt_1, \dots, dt_k$. It is called the differential form of degree p
(p-form).

There are some operations which can be performed on differential
forms. For instance, we can add together forms of the same degree.
By the way, expression (73) is a sum of the simplest forms which
are monomials in $dt_1, dt_2, \dots$. Differential forms can be multiplied
by one another under the convention that, according to the
definition of an integral of form (53), a permutation of two differentials
entering into a monomial results in multiplying it by -1, and
that if there are two similar differentials the monomial is
considered, by definition, to be equal to zero. A differential form can also
be multiplied by a constant or by a function of $t_1, t_2, \dots, t_k$. By
the way, the latter can be regarded as a form of degree zero. The
ordinary rules of addition and multiplication hold in this case
with the exception that the multiplication of forms is
non-commutative in the general case.

A differential form is differentiated according to the following
rule:

$$d\omega = d\left( \sum a_{m_1, m_2, \dots, m_p}\, dt_{m_1} \dots dt_{m_p} \right) = \sum da_{m_1, \dots, m_p}\, dt_{m_1} \dots dt_{m_p} = \sum \left( \frac{\partial a_{m_1, \dots, m_p}}{\partial t_1}\, dt_1 + \dots + \frac{\partial a_{m_1, \dots, m_p}}{\partial t_k}\, dt_k \right) dt_{m_1} \dots dt_{m_p}$$

where we must remove the brackets and combine the similar terms.
We see that the differentiation of a differential form increases its
degree by unity.

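It turns out that the above definitions imply the general formula
for transforming integrals (54):

$$\underbrace{\int \dots \int_{(\Omega')}}_{p \text{ times}} \omega = \underbrace{\int \dots \int_{(\Omega)}}_{(p+1) \text{ times}} d\omega \qquad (74)$$

We shall not give the proof of the formula here.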

As an example, let us take the case k = 2, p = 1. If we write
x and y in place of $t_1$ and $t_2$ and introduce the notation

$$\omega = P(x, y)\, dx + Q(x, y)\, dy$$

we obtain

$$d\omega = dP\, dx + dQ\, dy = \left( \frac{\partial P}{\partial x}\, dx + \frac{\partial P}{\partial y}\, dy \right) dx + \left( \frac{\partial Q}{\partial x}\, dx + \frac{\partial Q}{\partial y}\, dy \right) dy = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\, dy$$

It follows that formula (74) turns into a formula which differs from
Green’s formula (70) only in the notation of the domains of
integration. Let the reader consider the case k = 3, p = 1 (which leads
to Stokes’ formula) and the case k = 3, p = 2 (which yields
Ostrogradsky’s formula). In treating these cases the reader should take
into account the following expression of a flux in the form of a
double integral with respect to coordinates:

$$\int_{(\sigma)} \mathbf{A}\cdot d\boldsymbol{\sigma} = \int_{(\sigma)} \mathbf{A}\cdot\mathbf{n}\, d\sigma = \int_{(\sigma)} [A_x \cos(\mathbf{n}, x) + A_y \cos(\mathbf{n}, y) + A_z \cos(\mathbf{n}, z)]\, d\sigma = \iint_{(\sigma)} (A_x\, dy\, dz + A_y\, dz\, dx + A_z\, dx\, dy)$$

(compare this expression with formulas given in Sec. 22).
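
For the case k = 3, p = 2 the computation can be sketched as follows; it is meant as a hint for the reader's own check and relies only on the rules for differential forms stated in this section. Taking the 2-form from the expression of a flux,

$$\omega = A_x\, dy\, dz + A_y\, dz\, dx + A_z\, dx\, dy$$

and dropping all the monomials which contain two similar differentials we get

$$d\omega = \frac{\partial A_x}{\partial x}\, dx\, dy\, dz + \frac{\partial A_y}{\partial y}\, dy\, dz\, dx + \frac{\partial A_z}{\partial z}\, dz\, dx\, dy = \left( \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \right) dx\, dy\, dz$$

because a circular permutation of three differentials amounts to two successive permutations of pairs and therefore does not change the sign: dy dz dx = dz dx dy = dx dy dz. The coefficient on the right-hand side is div A, so that the general formula for transforming integrals turns into Ostrogradsky's formula.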



CHAPTER XVII 


Series 


We have already dealt with series in our course. We suggest that
the reader should look through Sec. III. 6, where the basic
definitions of the convergence and of the sum of an infinite series were
formulated, before proceeding to study the present chapter. Here
we shall give a systematic presentation of the theory of series.


§ 1. Number Series 

1. Positive Series. We now consider a series of the form

$$a_1 + a_2 + \dots + a_n + \dots \qquad (1)$$

in which $a_n \ge 0$ for all n = 1, 2, 3, .... Such a series with
non-negative terms is referred to as a positive series. As before, let
us denote the partial sums of the series as $S_1, S_2, \dots, S_n, \dots$.
In this case we have $S_1 \le S_2 \le \dots \le S_n \le \dots$ (why?).
Therefore, recalling the two possible ways of variation of an increasing
quantity (see Sec. III. 5) we conclude that series (1) is either
convergent or properly divergent (i.e. divergent to infinity), its sum
being $+\infty$ in the latter case. This can be written as

$$\sum_{k=1}^{\infty} a_k < \infty \qquad \text{or} \qquad \sum_{k=1}^{\infty} a_k = \infty$$

respectively.

It should be noted that the first inequality is a symbolical expres- 
sion of the convergence of the series and it makes sense only for 
positive series. 

If besides series (1) we consider a series

$$b_1 + b_2 + \dots + b_n + \dots \qquad (2)$$

such that

$$a_k \le b_k \quad (k = 1, 2, 3, \dots) \qquad (3)$$

then

$$\sum_{k=1}^{\infty} a_k \le \sum_{k=1}^{\infty} b_k$$

Indeed, this is directly implied by the analogous inequality for 
the partial sums of the series. Thus, we arrive at the comparison 
test for the convergence of a series similar to the one given in 
Sec. XIV. 15 for an improper integral: if condition (3) holds the 
convergence of series (2) implies the convergence of series (1) and 
the divergence of series (1) implies the divergence of series (2). 
For example, the series

$$\frac{1}{3^2 \ln 2} + \frac{1}{3^3 \ln 3} + \frac{1}{3^4 \ln 4} + \dots = \sum_{k=2}^{\infty} \frac{1}{3^k \ln k}$$

converges which follows from the comparison of this series with
series (III. 6):

$$\frac{1}{3^n \ln n} < \frac{1}{3^n} \quad (n = 3, 4, \dots)$$

Although the first terms of the series do not satisfy the above
inequality this does not affect the convergence (see Sec. III. 6).
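
The comparison can be watched numerically. The sketch below assumes, for illustration only, that the series used for comparison is the geometric series with terms 1/3^k; the helper `partial_sum` and the number of terms are likewise choices made here. Beginning with k = 3 we have ln k > 1, so each term of the first series is smaller than the corresponding term of the second one, and the partial sums obey the same inequality.

```python
import math

def partial_sum(term, start, n_terms):
    """Sum of term(k) for k = start, start + 1, ..., start + n_terms - 1."""
    return sum(term(k) for k in range(start, start + n_terms))

def a(k):
    """General term of the series under investigation."""
    return 1.0 / (3**k * math.log(k))

def b(k):
    """General term of the geometric series assumed for comparison."""
    return 1.0 / 3**k

# The termwise inequality a_k < b_k holds from k = 3 on.
termwise = all(a(k) < b(k) for k in range(3, 60))

sum_a = partial_sum(a, 3, 57)
# The geometric series starting at k = 3 has the sum (1/27) / (1 - 1/3) = 1/18.
sum_b = partial_sum(b, 3, 57)
```

The term with k = 2, for which the inequality fails because ln 2 < 1, adds only a finite number to the sum and has no influence on the convergence.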

The comparison test implies an analogous test: if $a_k > 0$, $b_k > 0$
$(k = 1, 2, \dots)$ and

$$\frac{a_k}{b_k} \to C \ \text{ as } k \to \infty, \quad \text{where } C = \text{const} \ne 0 \text{ and } C \ne \infty,$$

series (1) and (2) converge or, respectively, diverge simultaneously.
Actually, the above condition implies that the ratio $a_k / b_k$ lies
between some constant positive limits $m$ and $M$ for all $k$:

$$m \le \frac{a_k}{b_k} \le M, \quad \text{i.e.} \quad m b_k \le a_k \le M b_k$$

Now, summing with respect to $k$ from 1 to $n$ and then passing to
the limit for $n \to \infty$, we obtain

$$m \sum_{k=1}^{\infty} b_k \le \sum_{k=1}^{\infty} a_k \le M \sum_{k=1}^{\infty} b_k$$

which implies our assertion (why?).

Now let us formulate D’Alembert’s test which is sufficient for the
convergence of series (1) and is widely applied: if the limit

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = l$$

exists for series (1), the series converges in the case $l < 1$ and
diverges in the case $l > 1$. The latter assertion is obvious. Indeed,
if $l > 1$ the ratio $a_{n+1}/a_n$ (which approaches $l$ as $n$
increases) becomes greater than unity for sufficiently large $n$.
Hence, the terms of the series increase together with $n$ when $n$
becomes sufficiently large and therefore the necessary condition for
the convergence of a series (see Sec. III.6) is violated. Now let us
turn to the case $l < 1$. Choose a constant number $l'$ between $l$
and 1. The ratio approaching $l$ indefinitely, we have
$a_{n+1}/a_n < l'$ for $n \ge N$ where $N$ is a fixed number. Thus, we
obtain

$$\frac{a_{N+1}}{a_N} < l', \quad \frac{a_{N+2}}{a_{N+1}} < l', \quad \frac{a_{N+3}}{a_{N+2}} < l', \quad \dots$$

which implies

$$a_{N+1} < a_N l', \quad a_{N+2} < a_{N+1} l' < a_N (l')^2, \quad a_{N+3} < a_{N+2} l' < a_N (l')^3, \quad \text{etc.}$$

Hence, after some number $N$, the terms of series (1) do not exceed
the corresponding terms of the series

$$a_N + a_N l' + a_N (l')^2 + a_N (l')^3 + \dots$$

The quantity $l'$ satisfying the inequality $0 < l' < 1$, the latter
series is a geometric series (progression) with common ratio $l' < 1$
which converges [see series (III.7)]. Hence, by the comparison
test, series (1) also converges.

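Let us consider an example. To apply D’Alembert’s test to the
series

$$\sum_{n=1}^{\infty} \frac{a^n}{n^p} \quad (a > 0;\ p > 0 \text{ or } p < 0) \tag{4}$$

it is necessary to consider the limit

$$\lim_{n \to \infty} \left[ \frac{a^{n+1}}{(n+1)^p} : \frac{a^n}{n^p} \right] = \lim_{n \to \infty} a \left( 1 + \frac{1}{n} \right)^{-p} = a$$

Hence, series (4) converges for $a < 1$ and diverges for $a > 1$.
In the case $a = 1$ D’Alembert’s test does not enable us to find out
whether the series converges or not.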
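
The behaviour of the ratio in this example can also be observed numerically. The following is only an illustrative sketch: the function name `ratio` and the sample values of $a$ and $p$ are our own choices, and the general term of series (4) is taken to be $a^n / n^p$.

```python
def ratio(a: float, p: float, n: int) -> float:
    """Ratio of consecutive terms of series (4), with general term a**n / n**p.

    The quotient is simplified algebraically to a * (n / (n + 1))**p,
    which avoids overflow for large n.
    """
    return a * (n / (n + 1)) ** p


# For each a the ratio settles near a, whatever the value of p.
for a in (0.5, 1.0, 2.0):
    print(a, [round(ratio(a, 3.0, n), 6) for n in (10, 1000, 100000)])
```

For $a = 1$ the printed ratios approach unity, which is exactly the case in which the test gives no answer.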

When D’Alembert’s test fails to give the answer it is sometimes
possible to apply Cauchy’s integral test which is more general. It
gives the following condition sufficient for the convergence of a
positive series: if the expression $a_n$ can be defined not only for
the integral values of $n$ $(n = 1, 2, 3, \dots)$ but also for all real
$n \ge 1$ and if $a_n$ decreases when $n$ increases, series (1) and
the integral

$$\int_1^{\infty} a_n \, dn$$

converge or diverge simultaneously. To prove the test let us establish
the inequalities

$$\int_1^{\infty} a_n \, dn \le \sum_{n=1}^{\infty} a_n \le \int_1^{\infty} a_n \, dn + a_1 \tag{5}$$

which directly imply the above assertion according to the comparison
test. To obtain (5) we write, on the basis of Fig. 334a, the inequality

$$\int_1^{N} a_n \, dn \le a_1 + a_2 + \dots + a_{N-1} \tag{6}$$

[Fig. 334]

Similarly, Fig. 334b shows that

$$\int_1^{N} a_n \, dn \ge a_2 + a_3 + \dots + a_N$$

and hence

$$a_1 + a_2 + \dots + a_N \le \int_1^{N} a_n \, dn + a_1 \tag{7}$$

If we pass to the limit as $N \to \infty$ in inequalities (6) and (7)
we just arrive at (5).
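
Inequalities (6) and (7) are easy to check numerically. The sketch below assumes the particular decreasing term $a_n = 1/n^2$, for which the integral from 1 to $N$ equals $1 - 1/N$; the function name `integral_bounds` is our own.

```python
def integral_bounds(N: int):
    """Return (lower, partial_sum, upper) for a_n = 1/n**2.

    lower: inequality (6) written for N + 1, i.e. the integral of a_n
           from 1 to N + 1, which does not exceed a_1 + ... + a_N.
    upper: inequality (7), i.e. the integral from 1 to N plus a_1.
    """
    partial_sum = sum(1.0 / n ** 2 for n in range(1, N + 1))
    lower = 1.0 - 1.0 / (N + 1)        # integral of dn / n**2 over [1, N + 1]
    upper = (1.0 - 1.0 / N) + 1.0      # integral over [1, N], plus a_1 = 1
    return lower, partial_sum, upper


for N in (10, 100, 1000):
    print(N, integral_bounds(N))
```

As $N$ grows the two bounds tend to 1 and 2, in agreement with (5).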

As an example, let us take the series

$$\sum_{n=1}^{\infty} \frac{1}{n^p} \tag{8}$$

which is a particular case of series (4) for $a = 1$. D’Alembert’s test
does not give the answer whether the series converges because in
this case we have $\lim_{n \to \infty} a_{n+1}/a_n = 1$. But the
integral test is applicable here. Indeed, if we consider
$a_n = n^{-p}$ as a continuous function of $n$ for all real values of
$n$ (exceeding unity) we see that it satisfies the conditions of
Cauchy’s integral test for $p > 0$ and therefore series (8) and the
integral

$$\int_1^{\infty} \frac{dn}{n^p}$$

converge or diverge simultaneously.
In Sec. XIV.15 we showed that the last integral converges only
when $p > 1$ [see the computation of integral (XIV.51)]. Thus,
series (8) converges only in the case $p > 1$. In particular, for
$p = 1$ we obtain the so-called harmonic series

$$1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n} + \dots = \infty$$


Formulas (6) and (7) make it possible to estimate a partial sum
of a divergent series, which enables us to derive an asymptotic
formula for such a sum depending on its number. We can similarly
estimate any sum of a large number of summands which monotonically
depend on the number. To make the result more precise we can separately
sum up a number of the greatest summands in a direct manner and
apply the above method of estimation only to the remaining summands
because this will decrease the difference between the upper
and lower bounds entering into a formula of type (5).

More accurate approximations can be obtained by applying formulas
of numerical integration (see Sec. XIV.13). For instance,
let us illustrate the application of Simpson’s formula to an
approximate computation of the sum

$$S_{m,N} = \frac{1}{m} + \frac{1}{m+1} + \dots + \frac{1}{N} \quad (m = 1, 2, \dots;\ N \ge m + 2)$$

To do this we write, on the basis of formula (XIV.39) taken for
$h = 1$, the relation

$$\int_k^{k+1} \frac{dx}{x} + \int_{k+1}^{k+2} \frac{dx}{x} = \int_k^{k+2} \frac{dx}{x} \approx \frac{1}{3} \left( \frac{1}{k} + \frac{4}{k+1} + \frac{1}{k+2} \right)$$

Adding together these results for $k = m, m + 1, \dots, N - 1$
and performing some simple transformations and the integration
we obtain the approximate equality

$$\ln N - \ln m + \ln (N+1) - \ln (m+1) \approx 2 S_{m,N} - \frac{6m+5}{3m(m+1)} - \frac{1}{3N(N+1)}$$

This implies

$$S_{m,N} \approx \frac{1}{2} \ln \frac{N(N+1)}{m(m+1)} + \frac{6m+5}{6m(m+1)} + \frac{1}{6N(N+1)} \tag{9}$$

Fig. 334a obviously indicates that for the function $a_n = 1/n$ there
exists a finite positive limit of the form

$$C = \lim_{N \to \infty} \left( 1 + \frac{1}{2} + \dots + \frac{1}{N} - \ln N \right)$$

which is called Euler’s constant. Equality (9) and the formula
$S_{m,N} = S_{1,N} - S_{1,m-1}$ imply an approximate expression
for Euler’s constant of the form

$$C \approx \frac{1}{1} + \frac{1}{2} + \dots + \frac{1}{m-1} + \frac{6m+5}{6m(m+1)} - \frac{1}{2} \ln (m^2 + m)$$

whose accuracy increases with the growth of $m$. For $m = 1$ and
$m = 2$ we get the approximate values 0.570 and 0.576, respectively,
of Euler’s constant. The calculations show that $C = 0.5772$ to
within $10^{-4}$, and thus the above values are accurate to two decimal
places.
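
The quoted values can be reproduced in a few lines. The sketch below assumes the approximate expression in the form $C \approx 1 + \frac{1}{2} + \dots + \frac{1}{m-1} + \frac{6m+5}{6m(m+1)} - \frac{1}{2}\ln(m^2+m)$; the function name `euler_estimate` is our own.

```python
import math


def euler_estimate(m: int) -> float:
    """Approximate Euler's constant from the Simpson-based expression.

    The leading sum 1 + 1/2 + ... + 1/(m - 1) is empty for m = 1.
    """
    head = sum(1.0 / k for k in range(1, m))
    return head + (6 * m + 5) / (6 * m * (m + 1)) - 0.5 * math.log(m * m + m)


for m in (1, 2, 5, 10):
    print(m, round(euler_estimate(m), 6))
```

The values for $m = 1$ and $m = 2$ round to 0.570 and 0.576, and larger $m$ bring the estimate closer to 0.5772.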

For greater detail on the problem of computing sums by means
of integrals the reader is referred to [51] (see §§ 1.2 and III.4).

2. Series with Terms of Arbitrary Signs. We now turn to series
of the form

$$a_1 + a_2 + \dots + a_n + \dots \tag{10}$$

whose terms can be of any sign. Let us form the positive series

$$|a_1| + |a_2| + \dots + |a_n| + \dots$$

with terms equal to the absolute values of the terms of series (10).
We assert that if

$$\sum_{k=1}^{\infty} |a_k| < \infty \tag{11}$$

series (10) is convergent. In this case series (10) is said to be
absolutely convergent. The proof is quite similar to that of an analogous
property discussed in Sec. XIV.15 and we leave it to the reader. If
series $|a_1| + |a_2| + \dots + |a_n| + \dots$ diverges series (10) may
nevertheless converge. In such a case we say that series (10) is
conditionally convergent.

When we are given a general series of form (10) we can apply the
tests given in Sec. 1 to the series of the moduli of its terms and thus
investigate whether it absolutely converges or not. For instance, if

$$\lim_{n \to \infty} \frac{|a_{n+1}|}{|a_n|} < 1$$

the series $|a_1| + |a_2| + \dots + |a_n| + \dots$ converges according
to D’Alembert’s test and hence series (10) converges absolutely.
If the limit exceeds unity the necessary condition for convergence of a
series is violated and thus series (10) diverges.

The following test is referred to as the Leibniz test. It is applied 
to the so-called alternating series, i.e. series of the form 

$$a_1 - a_2 + a_3 - a_4 + \dots \tag{12}$$

where $a_k > 0$ $(k = 1, 2, \dots)$. The Leibniz test asserts that if
$a_1 > a_2 > a_3 > \dots$ and $\lim_{k \to \infty} a_k = 0$ series (12)
converges. To prove the test let us mark the points corresponding to
the numerical values of the partial sums of series (12) on the S-axis
(see Fig. 335). Then, considering the transitions from 0 to $S_1$,
from $S_1$ to $S_2$, from $S_2$ to $S_3$ etc. we see that every
subsequent transition is performed in the direction opposite to that
of the preceding transition and at the same time the corresponding
distances between the points $S_k$ and $S_{k+1}$ $(k = 1, 2, \dots)$
decrease.

[Fig. 335]

Thus, we have $0 < S_2 < S_1$, $S_2 < S_3 < S_1$,
$S_2 < S_4 < S_3$, ... (see Fig. 335). Consequently,
the partial sums with even numbers form an increasing bounded
sequence and therefore they have a finite limit $S'$ (see property 10
in Sec. III.5). Similarly, the partial sums with odd numbers form
a decreasing bounded sequence and have a limit $S''$. Taking the
equality $S_{2n+1} = S_{2n} + a_{2n+1}$ and passing to the limit as
$n \to \infty$ we obtain $S'' = S'$. Hence, all the partial sums have
the same limit
and thus series (12) converges. Incidentally, we conclude that the 
sum of series (12) lies between any sum with an even number and any 
sum with an odd number which enables us to estimate the sum of the 
series. 
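
The bracketing of the sum by the even and odd partial sums can be seen on a concrete series. The sketch below takes $a_n = 1/n^p$ with $p = 1$; that the sum of this particular series is $\ln 2$ is used only for comparison. The function name is our own.

```python
import math


def alternating_partial_sums(p: float, count: int):
    """Partial sums S_1, ..., S_count of 1 - 1/2**p + 1/3**p - ..."""
    sums, s = [], 0.0
    for n in range(1, count + 1):
        s += (-1) ** (n - 1) / n ** p
        sums.append(s)
    return sums


sums = alternating_partial_sums(1.0, 20)
even = sums[1::2]   # S_2, S_4, ..., S_20: an increasing sequence
odd = sums[0::2]    # S_1, S_3, ..., S_19: a decreasing sequence
print(max(even), math.log(2.0), min(odd))
```

Every even-numbered sum lies below $\ln 2$ and every odd-numbered sum lies above it, so any such pair gives a two-sided estimate of the sum.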

For example, the series

$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^p}$$

satisfies the conditions of the Leibniz test for $p > 0$ and therefore
it converges for such $p$. For $p > 1$ the convergence will be absolute
because, as we know, the series $\sum_{n=1}^{\infty} \frac{1}{n^p}$
converges for $p > 1$ (see Sec. 1). But in the case $0 < p \le 1$ the
series $\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^p}$ is conditionally
convergent, i.e. it is convergent but not absolutely convergent.

In conclusion, after passing to the limit in the inequality

$$\left| \sum_{k=1}^{n} a_k \right| \le \sum_{k=1}^{n} |a_k|$$

we see that if series (10) converges it satisfies the inequality

$$\left| \sum_{k=1}^{\infty} a_k \right| \le \sum_{k=1}^{\infty} |a_k|$$

3. Operations on Series.

1. Convergent series can be termwise added together (or subtracted),
that is if

$$a_1 + a_2 + \dots + a_n + \dots = S \quad \text{and} \quad b_1 + b_2 + \dots + b_n + \dots = T$$

we have

$$(a_1 \pm b_1) + (a_2 \pm b_2) + \dots + (a_n \pm b_n) + \dots = S \pm T$$

To prove this we must take the obvious expression $P_n = S_n \pm T_n$
of the partial sum of the latter series and then pass to the limit as
$n \to \infty$.

This property enables us to perform the following transformation
of a series. Suppose we are given an absolutely convergent series
with terms of arbitrary signs. Let us put it down in the form

$$a - b - c + d + e + f - g + \dots = S \tag{13}$$

where all $a, b, c, \dots$ are positive. Let us form the positive series

$$\left. \begin{aligned} a + 0 + 0 + d + e + f + 0 + \dots &= S_1 \\ 0 + b + c + 0 + 0 + 0 + g + \dots &= S_2 \end{aligned} \right\} \tag{14}$$

Here we have separately added together all the positive terms and
all the absolute values of the negative terms of the original series.
Then $S$ is equal to the difference $S_1 - S_2$ because we can termwise
subtract the second series from the first one. This operation
can be performed only on absolutely convergent series because
both series (14) diverge for a conditionally convergent series of
form (13) (why?). In the case of a conditionally convergent series
the partial sums of both series (14) tend to infinity and the
conditional convergence of series (13) is due to the “balance” between
these infinities, i.e. the difference between the partial sums tends to
a finite limit although the sums themselves are infinitely large.

The next property can be verified in a similar way. 

2. A convergent series can be multiplied termwise by a constant
factor, i.e. if

$$a_1 + a_2 + \dots + a_n + \dots = S$$

we have

$$k a_1 + k a_2 + \dots + k a_n + \dots = kS$$

3. We can arbitrarily group the terms when summing a convergent
series; for instance, if

$$a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7 + a_8 + \dots = S \tag{15}$$

we have

$$(a_1 + a_2) + a_3 + (a_4 + a_5 + a_6) + (a_7 + a_8) + \dots = S \tag{16}$$

Indeed, if the partial sums $S_1, S_2, S_3, \dots$ of the former series
tend to $S$, the partial sums of the latter series which respectively
equal $S_2, S_3, S_6, S_8, \dots$ also tend to $S$.

If series ( 15 ) properly diverges and we have S = oo the same is 
true for series ( 16 ). But if ( 15 ) is an oscillating divergent series (see 
Sec. III. 6) series ( 16 ) can diverge or converge and its sum will de- 
pend on the way of grouping the terms, i.e. on the way of bracketing. 
For instance, for series (III.9) we have

$$(1 - 1) + (1 - 1) + (1 - 1) + \dots = 0 + 0 + 0 + \dots = 0$$

and

$$1 + (-1 + 1) + (-1 + 1) + \dots = 1 + 0 + 0 + \dots = 1$$

Before the difference between convergent and divergent series was 
understood the above fact had been thought of as an inexplicable 
paradox. The modern definition of the sum of a convergent series 
was formulated by Cauchy in 1821, after the theory of limits was
created, although series were widely used as early as the 17th and
18th centuries.

4 . We can arbitrarily rearrange the terms in a positive series 
without affecting its sum. 

Indeed, if we arbitrarily change the order of terms in a positive 
series (without omitting any of them) and take successive partial 
sums of the new series, any term of the original series will enter into 
all the sums with sufficiently large numbers. Consequently, any 
partial sum of the original series will be a part of a partial sum of 
the new series having a sufficiently large number. This implies that 
the limit of the partial sums of the original series, i.e. its sum, 
does not exceed the sum of the new series. The original series can 
be obtained from the new series by rearranging the terms of the latter 
and therefore the same argument shows that the new sum cannot
exceed the old one. Hence, these sums are equal.

An absolutely convergent series with terms of arbitrary signs can
also be rearranged in an arbitrary way without affecting the sum.

Actually, as it was shown in property 1, an absolutely convergent
series can be represented as a difference of two positive convergent
series. Hence, any rearrangement of the terms of the original series
reduces to a rearrangement of the terms of these two series which,
as it has been shown, does not affect their sums.

Rearranging the terms of a conditionally convergent series we
can make it converge to any sum and even make it diverge. The
thing is that the partial sums of the positive and negative terms
[see (14)] of a conditionally convergent series are in a balance in
the sense that their rates of growth are of the same order. When
rearranging the terms of such a series we can change the relation
between these rates which can lead to the above result. At first
glance this fact looks like a paradox. We shall illustrate what has
been said by giving a simple example.

Take the series

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \frac{1}{9} - \frac{1}{10} + \frac{1}{11} - \dots = S \tag{17}$$

According to Sec. 2, its sum lies between the limits $S_2 = 0.5$ and
$S_1 = 1$. By property 2, this implies that

$$\frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \frac{1}{8} + \frac{1}{10} - \dots = \frac{S}{2}$$

and therefore we also have

$$0 + \frac{1}{2} + 0 - \frac{1}{4} + 0 + \frac{1}{6} + 0 - \frac{1}{8} + 0 + \frac{1}{10} + 0 - \frac{1}{12} + \dots = \frac{S}{2}$$

Adding termwise the last series and series (17) we obtain

$$1 + 0 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + 0 + \frac{1}{7} - \frac{1}{4} + \dots = \frac{3}{2} S$$

that is

$$1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \dots = \frac{3}{2} S$$

But the above series can be obtained from series (17) by rearranging
the terms of the latter (check it up!) and hence we see that the sum
has been changed.
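
The change of the sum is easy to observe numerically. In the sketch below the rearranged series is summed in blocks of two positive terms followed by one negative term; the value $S = \ln 2$ of the original series is used only for comparison, and the function names are our own.

```python
import math


def original_sum(terms: int) -> float:
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in its natural order."""
    return sum((-1) ** (n - 1) / n for n in range(1, terms + 1))


def rearranged_sum(blocks: int) -> float:
    """Partial sum of 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ...

    Block k contributes 1/(4k - 3) + 1/(4k - 1) - 1/(2k).
    """
    total = 0.0
    for k in range(1, blocks + 1):
        total += 1.0 / (4 * k - 3) + 1.0 / (4 * k - 1) - 1.0 / (2 * k)
    return total


print(original_sum(200000), math.log(2.0))
print(rearranged_sum(100000), 1.5 * math.log(2.0))
```

Both series consist of the same terms, yet the first partial sums approach $S$ and the second approach $\frac{3}{2} S$.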

Thus, when dealing with processes connected with a rearrange- 
ment of the terms of a series we can treat the absolutely convergent 
series as if they were finite sums. At the same time we must cautious- 
ly perform such operations on conditionally convergent series. 

4. Speed of Convergence of a Series. In practical computations 
of the sum of a series we usually compute the sum of several terms 
suppressing the others when there is no reason to think that these 
terms can essentially affect the sum (compare with the computation 
of number e in Sec. IV. 16). For this method to yield a good result 
it is necessary that the series in question not just converge but
converge fast so that we could exhaust almost the whole sum when
taking a small number of terms, that is obtain a result approximating
the sum with a sufficient accuracy. If a series converges
slowly it is usually inapplicable to practical computations. By 
the way, it is sometimes possible to obtain a new series by perform- 
ing some operation on the original series or to derive an approxi- 
mate expression for its remainder by applying integrals as it was 
done in Sec. 1. Conditionally convergent series usually converge 
very slowly (see Sec. 2). But there are also absolutely convergent 
series which converge slowly. 

The speed of convergence of a series is essentially dependent on 
the rate of variation of its general term when it tends to zero, as 
the number $n$ increases. Series whose general term $a_n$ is of the
order of $n^{-p}$ [i.e. $a_n = O(n^{-p})$; see Sec. III.11] for $p > 1$
usually converge slowly. The greater $p$, the better the convergence of
such series. Series with terms $a_n$ of the order of $q^n$,
$0 < q < 1$, converge faster. Their convergence can be compared with
that of a geometric series of form (III.7), and the smaller $q$, the
faster the convergence.

The convergence of series whose terms are of the order of is 
still better etc. 

Of course, what has been said represents only general considera- 
tions concerning the speed of convergence, and in a concrete case 
not only the behaviour of the general term for n ->■ oo can be essen- 
tial but also the character of the first terms of the series. 

When we are given a slowly convergent series

$$a_1 + a_2 + \dots + a_n + \dots \tag{18}$$

we can try to pass from it to a series which converges faster. One
of these methods is as follows. We choose a series

$$b_1 + b_2 + \dots + b_n + \dots = \sigma$$

whose sum $\sigma$ is known so that $a_n \sim b_n$ for $n \to \infty$
(see Secs. III.7, 8). Then we have $a_n = b_n + \gamma_n$ where
$|\gamma_n| \ll |a_n|$, and therefore series (18) can be represented in
the form

$$(b_1 + \gamma_1) + (b_2 + \gamma_2) + \dots = (b_1 + b_2 + \dots) + (\gamma_1 + \gamma_2 + \dots) = \sigma + \gamma_1 + \gamma_2 + \dots + \gamma_n + \dots$$

and the general term of the latter series tends to zero faster than
that of the original series.

To apply the above method we must have a set of series with
known sums at our disposal. Geometric series (III.7), series indicated
in Sec. IV.16, some combinations of these series and the series

$$\sum_{n=1}^{\infty} \frac{1}{n^p} = \zeta(p) \quad (p > 1) \tag{19}$$

are most frequently used for this purpose. The latter sum (dependent
on $p$) is called the Riemann zeta function after the prominent
German mathematician G. F. B. Riemann (1826-1866) although
it was Euler who was the first to introduce the function in 1737. The
tables of the values of the function can be found in [23].

For instance, let us take the series

$$\sum_{n=1}^{\infty} \frac{1}{\sqrt{n^3 + 1}} \tag{20}$$

Its terms are equivalent (as infinitesimals) to the terms of series (19)
for $p = \frac{3}{2}$, as $n \to \infty$. Hence, series (20) converges
but very slowly. Taking advantage of inequalities (5) we can readily
verify that the remainder after $n$ terms of series (19) is equivalent
to $\frac{n^{1-p}}{p-1}$, i.e. the remainder of series (20) is of the
order of $2 n^{-1/2}$. Thus to obtain its sum $S$ with an accuracy of
0.01 we must take about 40,000 terms! But here we can apply the above
method and write

$$\frac{1}{\sqrt{n^3 + 1}} = \frac{1}{\sqrt{n^3}} + \gamma_n$$

where

$$\gamma_n = \frac{1}{\sqrt{n^3 + 1}} - \frac{1}{\sqrt{n^3}} = \frac{\sqrt{n^3} - \sqrt{n^3 + 1}}{\sqrt{n^3} \sqrt{n^3 + 1}} = - \frac{1}{\sqrt{n^3} \sqrt{n^3 + 1} \left( \sqrt{n^3} + \sqrt{n^3 + 1} \right)}$$

Consequently, series (20) can be represented in the form

$$S = \zeta\left(\tfrac{3}{2}\right) - \sum_{n=1}^{\infty} \frac{1}{\sqrt{n^3} \sqrt{n^3 + 1} \left( \sqrt{n^3} + \sqrt{n^3 + 1} \right)} \tag{21}$$

The tabular value of the first summand is equal to 2.612, and the
general term of the latter series is equivalent to
$\left(2 n^{9/2}\right)^{-1}$. Hence, the remainder of the last series
is of the order of $\left(7 n^{7/2}\right)^{-1}$ which means
that to obtain its sum to within 0.01 it is sufficient to take only
three terms! If a greater accuracy is needed we can repeatedly apply
the method which yields


$$S = \zeta\left(\tfrac{3}{2}\right) - \frac{1}{2} \zeta\left(\tfrac{9}{2}\right) + \sum_{n=1}^{\infty} \frac{2 \sqrt{n^3} + A}{2 n^{9/2} A \left( \sqrt{n^3} + A \right)^2}$$

(check up the result!) where we have put, for brevity, $\sqrt{n^3 + 1} = A$.
The sum of the tabular values of the first two terms is equal to
2.085, and the remainder of the last series is equivalent to
$\frac{3}{52} n^{-13/2}$.
Hence, to obtain $S$ with an accuracy of 0.001 it is sufficient to take
only two terms.
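
The gain in speed can be seen in a small computation. The sketch below compares the direct partial sums of series (20) with representation (21); it assumes the tabular value 2.612 for $\zeta(\frac{3}{2})$ quoted in the text, and the names `direct`, `correction` and `accelerated` are our own.

```python
import math

ZETA_3_2 = 2.612  # tabular value of zeta(3/2) quoted in the text


def direct(terms: int) -> float:
    """Partial sum of series (20), i.e. of 1 / sqrt(n**3 + 1)."""
    return sum(1.0 / math.sqrt(n ** 3 + 1) for n in range(1, terms + 1))


def correction(n: int) -> float:
    """n-th term of the series in (21); equals 1/sqrt(n**3) - 1/sqrt(n**3 + 1)."""
    a = math.sqrt(n ** 3)
    big_a = math.sqrt(n ** 3 + 1)
    return 1.0 / (a * big_a * (a + big_a))


def accelerated(terms: int) -> float:
    """Approximate S by formula (21) truncated after the given number of terms."""
    return ZETA_3_2 - sum(correction(n) for n in range(1, terms + 1))


print("direct, 3 terms:      ", direct(3))
print("accelerated, 3 terms: ", accelerated(3))
print("accelerated, 2000:    ", accelerated(2000))
```

Three terms of (21) already agree with the long computation to within 0.01, whereas three terms of the original series are off by more than a unit.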

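The above successive application of the method can be perfected
if we apply Taylor’s series (IV.60) (the binomial series) to the
expression

$$\frac{1}{\sqrt{n^3 + 1}} = (n^3 + 1)^{-1/2} = n^{-3/2} \left( 1 + \frac{1}{n^3} \right)^{-1/2} = n^{-3/2} \left( 1 - \frac{1}{2 n^3} + \frac{3}{8 n^6} - \frac{5}{16 n^9} + \dots \right) \tag{22}$$

Suppressing the terms of series (22) following after any term we
obtain the corresponding approximation. For instance, dropping
the terms after the third, we obtain

$$\frac{1}{\sqrt{n^3 + 1}} = n^{-3/2} \left( 1 - \frac{1}{2 n^3} + \frac{3}{8 n^6} \right) - \gamma_n \tag{23}$$

which yields

$$S = \zeta\left(\tfrac{3}{2}\right) - \frac{1}{2} \zeta\left(\tfrac{9}{2}\right) + \frac{3}{8} \zeta\left(\tfrac{15}{2}\right) - \sum_{n=1}^{\infty} \gamma_n = 2.462 - \sum_{n=1}^{\infty} \gamma_n$$

The terms $\gamma_n$ can be found from (23). On the basis of (22), we
conclude that they are equivalent to $\frac{5}{16} n^{-21/2}$.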

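Many other series can be transformed in a similar way. Besides
(19) we also use the series

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^p} = \sum_{n=1}^{\infty} \frac{1}{n^p} - 2 \sum_{n=1}^{\infty} \frac{1}{(2n)^p} = \zeta(p) \left( 1 - \frac{1}{2^{p-1}} \right), \tag{24}$$

$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \sum_{n=1}^{\infty} \left( \frac{1}{n} - \frac{1}{n+1} \right) = \left( 1 - \frac{1}{2} \right) + \left( \frac{1}{2} - \frac{1}{3} \right) + \left( \frac{1}{3} - \frac{1}{4} \right) + \dots = 1,$$

$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)} = \frac{1}{2} \sum_{n=1}^{\infty} \left[ \frac{1}{n(n+1)} - \frac{1}{(n+1)(n+2)} \right] = \frac{1}{2} \left[ \left( \frac{1}{1 \cdot 2} - \frac{1}{2 \cdot 3} \right) + \left( \frac{1}{2 \cdot 3} - \frac{1}{3 \cdot 4} \right) + \dots \right] = \frac{1}{4}$$

and so on.

Formula (24) also holds for $0 < p < 1$ but the function $\zeta(p)$
cannot be defined by formula (19) in this case since the series
diverges, and it is therefore defined by means of another method which
we shall not discuss here. Formula (IV.61) implies that, for $p = 1$,
the left-hand side of formula (24) is a convergent series whose sum
is equal to $\ln 2$.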
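
Formula (24) can be checked numerically for a value of $p$ at which the zeta function is known in closed form. The sketch below relies on the classical value $\zeta(2) = \pi^2/6$, which is not derived in this section, so that the right-hand side of (24) becomes $\pi^2/12$; the function name is our own.

```python
import math


def alternating_zeta_sum(p: float, terms: int) -> float:
    """Partial sum of the left-hand side of (24): sum of (-1)**(n + 1) / n**p."""
    return sum((-1) ** (n + 1) / n ** p for n in range(1, terms + 1))


# p = 2: the right-hand side of (24) is zeta(2) * (1 - 1/2) = pi**2 / 12.
print(alternating_zeta_sum(2.0, 1000), math.pi ** 2 / 12)

# p = 1: the left-hand side converges (slowly) to ln 2.
print(alternating_zeta_sum(1.0, 100000), math.log(2.0))
```

By the Leibniz test the error of each partial sum does not exceed the first omitted term, which explains the much slower agreement for $p = 1$.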

When we deal with alternating series it is usually very difficult 
to guarantee the desired accuracy in practical computations. For 
example, let us take series (IV.56) for the cosine putting x = 100: 


$$\cos 100 = 1 - \frac{100^2}{2!} + \frac{100^4}{4!} - \frac{100^6}{6!} + \frac{100^8}{8!} - \dots \tag{25}$$


The series on the right-hand side of (25) is convergent and even 
absolutely convergent (why?). But we cannot use it for practical 
calculations. The thing is that although its terms beginning with 
the 51st decrease sufficiently fast so that the theoretical convergence 
is guaranteed they become enormously large before that. The whole 
sum does not exceed unity in its absolute value and hence all these 
large terms must almost completely mutually cancel. As we know 
from Sec. 1.9, in such circumstances, in order to achieve the desired 
accuracy, we must carry out all the calculations with a great number 
of significant digits and hence perform much unnecessary work. 
Therefore we should avoid using series of type (25). If such series 
occur we must transform them to other series convenient for prac- 
tical calculations. For instance, in the above example we can take 
advantage of the periodicity of the cosine and pass to a considerably 
smaller value of the argument. 
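
The loss of accuracy is easy to reproduce in ordinary floating-point arithmetic. The following sketch sums series (25) directly and then again after reducing the argument by a multiple of $2\pi$; the function name `cos_series` and the number of terms are our own choices, and the library value `math.cos` serves only as a reference.

```python
import math


def cos_series(x: float, terms: int = 200) -> float:
    """Sum the first `terms` terms of the cosine series 1 - x**2/2! + x**4/4! - ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Pass from x**(2k) / (2k)! to the next term of the series.
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total


x = 100.0
reduced = x - 2.0 * math.pi * round(x / (2.0 * math.pi))  # about -0.53

print("direct summation:  ", cos_series(x))
print("reduced argument:  ", cos_series(reduced))
print("library reference: ", math.cos(x))
```

The intermediate terms of the direct summation reach the order of $10^{42}$, so with about 16 significant digits the cancellation destroys the result completely, while the reduced argument gives the correct value with a handful of terms.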

5. Series with Complex, Vector and Matrix Terms. The definition
of the convergence and of the sum of a series with complex terms

$$z_1 + z_2 + \dots + z_n + \dots, \quad z_n = x_n + i y_n \quad (n = 1, 2, 3, \dots) \tag{26}$$

is completely the same as that of real series (see Sec. III.6).
Series (26) is usually reduced to two series of the form

$$x_1 + x_2 + \dots + x_n + \dots \quad \text{and} \quad y_1 + y_2 + \dots + y_n + \dots \tag{27}$$

If both series (27) converge and have the sums $x$ and $y$, respectively,
series (26) also converges and has the sum $z = x + iy$. If at least
one of the series (27) is divergent series (26) is divergent as well.
Series (27) having real terms, we can apply the methods of Sec. 2
to them.

The following test is also of use: if

$$\sum_{n=1}^{\infty} |z_n| < \infty \tag{28}$$

both series (27) are absolutely convergent and therefore series (26)
is also convergent. If (28) holds series (26) is said to be absolutely
convergent. The methods of Sec. 1 can be applied to a series
satisfying condition (28).

A series of the form

$$\mathbf{u}_1 + \mathbf{u}_2 + \dots + \mathbf{u}_n + \dots \tag{29}$$

whose terms are vectors is treated in like manner. If all the vectors
$\mathbf{u}_n$ $(n = 1, 2, \dots)$ belong to a three-dimensional space
we can pass to the corresponding series with scalar terms by projecting
series (29) on the $x$-, $y$- and $z$-axes.

We can also consider a series of the form

$$A_1 + A_2 + \dots + A_n + \dots \tag{30}$$

where the terms $A_n$ $(n = 1, 2, \dots)$ are matrices. For series (30)
to converge it is necessary and sufficient that each of the series
formed of the corresponding elements of the matrices (Sec. XI.2)
(that is of the elements standing at the intersections of the same
rows and columns of the matrices) should converge.

The properties of series (26), (29) and (30) are the same as those 
of real series (see Sec. 3). 

6. Multiple Series. Finite sums can have more than one index 
of summation. 

For instance, 

$$\sum_{i=1}^{2} \sum_{j=1}^{3} a_{ij} = a_{11} + a_{12} + a_{13} + a_{21} + a_{22} + a_{23},$$

y y_L_2_4_ 1 . 1 . 1 I 1 . 1 . J_4_._L.4_ 1 . 1 

and the like (compare with Sec. XVI.8).

Infinite series can also have more than one summation index;
these series are called double, triple etc. and, generally, multiple.
We shall restrict ourselves to investigating a double series of the
simplest form

$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} \tag{31}$$

The series of higher multiplicity and also the series having variable
summation indices in the inner sum (the latter finite sum belongs
to this type) are treated similarly.

Let us first suppose that all $a_{ij} \ge 0$. We arrange all the terms
of series (31) in an arbitrary order and form an ordinary series,
for instance, we can arrange them as follows:

$$a_{11} + a_{12} + a_{21} + a_{13} + a_{22} + a_{31} + a_{14} + a_{23} + a_{32} + a_{41} + \dots \tag{32}$$

It is the sum of this series (which does not depend on the order of
the summands; see property 4 in Sec. 3) that is called the sum of
series (31). There can be two cases here, the convergence and the
divergence, i.e.

$$\text{either} \quad \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} < \infty \quad \text{or} \quad \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = \infty$$

Hence, for $a_{ij} \ge 0$, the sum of series (31) is independent of the
order of summation but of course we must not omit a single term
when forming a series of type (32). In particular, we can perform
the summation in the following ways:

$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = \sum_{i=1}^{\infty} \Bigl( \sum_{j=1}^{\infty} a_{ij} \Bigr) = \sum_{j=1}^{\infty} \Bigl( \sum_{i=1}^{\infty} a_{ij} \Bigr) \tag{33}$$


If the terms $a_{ij}$ are of any sign or are complex numbers, the
series satisfying the condition

$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} |a_{ij}| < \infty \tag{34}$$

represent the simplest case in which we call the series absolutely
convergent. If condition (34) holds the series (31) also converges
and its sum can be computed by means of any formula of form
(32) or (33) or with the help of the formula

$$\sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = \lim_{M, N \to \infty} \sum_{i=1}^{M} \sum_{j=1}^{N} a_{ij}$$

and the like. If condition (34) is violated the result of a summation
of series (31) may depend on the order of the terms (see Sec. 3),
and then we have a more complicated case.

In particular, a double series is obtained when we multiply two
absolutely convergent series

$$S_1 = \sum_{i=1}^{\infty} a_i \quad \text{and} \quad S_2 = \sum_{i=1}^{\infty} b_i = \sum_{j=1}^{\infty} b_j$$

Before performing the multiplication we have changed the notation
of the summation index in one of the series. The multiplication
can be performed as follows:

$$S_1 S_2 = \sum_{i=1}^{\infty} a_i \sum_{j=1}^{\infty} b_j = \sum_{i=1}^{\infty} \Bigl( a_i \sum_{j=1}^{\infty} b_j \Bigr) = \sum_{i=1}^{\infty} \Bigl( \sum_{j=1}^{\infty} a_i b_j \Bigr) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_i b_j$$

When writing the last equality we have applied formula (33) because
the series are absolutely convergent. Thus, series of this type are
multiplied according to the rule of multiplication of finite sums
(i.e. each term of the first series is multiplied by each term of the
second series and so on) which results in an absolutely convergent
double series.
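
The rule can be illustrated on two geometric series. In the sketch below the terms $a_i = (1/2)^i$ and $b_j = (-1/3)^j$ are our own choice; their sums are $1$ and $-1/4$, and the products $a_i b_j$ are added along the diagonals $i + j = \text{const}$, that is in the order of series (32). The function name is also our own.

```python
def diagonal_partial_sum(a, b, diagonals: int) -> float:
    """Sum a(i) * b(j) over all i, j >= 1 with i + j <= diagonals + 1.

    The terms are taken diagonal by diagonal, as in the ordering (32).
    """
    total = 0.0
    for s in range(2, diagonals + 2):   # s = i + j
        for i in range(1, s):
            total += a(i) * b(s - i)
    return total


def a(i: int) -> float:
    return 0.5 ** i            # the sum over i >= 1 equals 1


def b(j: int) -> float:
    return (-1.0 / 3.0) ** j   # the sum over j >= 1 equals -1/4


print(diagonal_partial_sum(a, b, 60), 1.0 * (-0.25))
```

The diagonal sums approach the product $S_1 S_2 = -\frac{1}{4}$; since both series converge absolutely, any other ordering of the products would give the same value.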

An analogous result is obtained when we multiply a greater num- 
ber of absolutely convergent series. 


§ 2. Functional Series 

7. Deviation of Functions. If the terms of a series are not numbers
(as in § 1) but functions there appears a question in what sense we
must understand the fact that the partial sums which are functions
approach the sum of the series (which is also a function) in the case
of convergence. Hence, the question is how to estimate the deviation
of one function from another.

It turns out that we can do this in different ways which are not
equivalent to one another whereas the deviation of two numbers
$a$ and $b$ is always characterized by the quantity $|a - b|$.

Let us be given two functions $f(x)$ and $g(x)$ defined in the same
finite interval $a \le x \le b$. The quantity

$$\max_{a \le x \le b} |f(x) - g(x)| \tag{35}$$

is called the maximal (uniform) deviation of the functions $f$ and
$g$ from each other. It is also sometimes referred to as Chebyshev's
deviation. The geometric meaning of the quantity is illustrated
in Fig. 336. This notion can be applied only to bounded functions.
As a rule, we use it when continuous functions are considered. If
the uniform deviation of two functions is small the difference between
the values of $f(x)$ and $g(x)$ is small at each point $x$ of the
interval $a \le x \le b$ and vice versa.

The quantity

$$\int_a^b |f(x) - g(x)| \, dx \tag{36}$$

is called the mean deviation of the functions $f$ and $g$ from each
other. Its geometric meaning is implied by Fig. 336: the quantity
equals the area shaded in the figure (taken in its absolute value).
The so-called mean square deviation is defined as

$$\sqrt{\int_a^b [f(x) - g(x)]^2 \, dx} \tag{37}$$

which is analogous to (36) in many respects but is more convenient
for calculations. Deviations (36) and (37) are used not only for
continuous functions but also for unbounded ones if the integral
(which is improper in this case) is convergent (Sec. XIV.16). There
are some other forms of deviation which are also of use but we do
not discuss them here.

If we replace the difference $f(x) - g(x)$ entering into formulas
(36) and (37) by maximal deviation (35) the integrals can only
increase and hence we obtain

$$\int_a^b |f(x) - g(x)| \, dx \le (b - a) \max_{a \le x \le b} |f(x) - g(x)|$$

and

$$\sqrt{\int_a^b [f(x) - g(x)]^2 \, dx} \le \sqrt{b - a} \, \max_{a \le x \le b} |f(x) - g(x)| \tag{38}$$

Consequently, if the maximal deviation of two functions from each
other is small their mean and mean square deviations are also
small. At the same time it can happen that the maximal deviation
of two functions is large whereas their mean deviations are small.
The possibility is illustrated in Fig. 337.
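
A simple numerical experiment shows the same effect. The sketch below approximates deviations (35), (36) and (37) on a grid of midpoints; the pair of functions $f(x) = x^{50}$, $g(x) = 0$ on the interval $0 \le x \le 1$ is our own example, and the function name `deviations` is hypothetical.

```python
import math


def deviations(f, g, a: float, b: float, samples: int = 100_000):
    """Approximate the maximal, mean and mean square deviations of f from g on [a, b]."""
    h = (b - a) / samples
    xs = [a + (i + 0.5) * h for i in range(samples)]          # midpoints of the cells
    diffs = [abs(f(x) - g(x)) for x in xs]
    maximal = max(diffs)                                      # (35), on the grid only
    mean = h * sum(diffs)                                     # (36), midpoint rule
    mean_square = math.sqrt(h * sum(d * d for d in diffs))    # (37), midpoint rule
    return maximal, mean, mean_square


mx, mean, ms = deviations(lambda x: x ** 50, lambda x: 0.0, 0.0, 1.0)
print(mx, mean, ms)
```

The maximal deviation is practically equal to unity whereas the mean deviation is about $1/51$ and the mean square deviation about $1/\sqrt{101}$; both satisfy inequalities (38).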

8. Convergence of a Functional Series. We now consider a series
of the form

$$f_1(x) + f_2(x) + \dots + f_n(x) + \dots \tag{39}$$

whose terms are the functions $f_k(x)$ $(k = 1, 2, \dots)$ defined over
the same finite interval $a \le x \le b$.

We say that the series converges to a function $S(x)$ on this interval,
which is called the sum of the series, if the deviation of the partial
sum $S_n(x) = \sum_{k=1}^{n} f_k(x)$ from $S(x)$ tends to zero as $n$ increases.
Depending on the form of the deviation (see Sec. 7) we speak about
the type of the convergence of series (39). For instance, if

$$\max_{a \le x \le b} |S(x) - S_n(x)| \to 0 \quad (n \to \infty)$$

we say that series (39) uniformly converges to its sum $S(x)$. Similarly,
we say that the series converges to $S(x)$ in the mean or in
the mean square* depending on whether we have

$$\int_a^b |S(x) - S_n(x)|\,dx \to 0 \quad (n \to \infty)$$

or

$$\sqrt{\int_a^b [S(x) - S_n(x)]^2\,dx} \to 0 \quad (n \to \infty)$$

Inequalities (38) indicate that if series (39) converges uniformly
it also converges in the mean (of any order) to the same sum. The
converse statement may not be true in the general case.
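The difference between the two types of convergence can be seen numerically. The sketch below is an illustration of our own (not taken from the book): it assumes the partial sums $S_n(x) = x^n$ on $0 \le x \le 1$ and the candidate sum $S(x) = 0$, and the helper name `deviations` is hypothetical. The mean deviation behaves like $1/(n+1)$ and tends to zero, whereas the maximal deviation stays equal to 1.

```python
def deviations(n, samples=10_000):
    """Maximal and mean deviation of S_n(x) = x**n from S(x) = 0 on [0, 1]."""
    h = 1.0 / samples
    values = [(i * h) ** n for i in range(samples + 1)]
    maximal = max(values)  # maximal deviation, attained at x = 1
    # trapezoidal rule for the integral of |S_n(x) - S(x)| over [0, 1]
    mean = h * (sum(values) - 0.5 * (values[0] + values[-1]))
    return maximal, mean


for n in (1, 10, 100, 1000):
    maximal, mean = deviations(n)
    print(n, maximal, mean)
```

The printed mean deviations decrease with $n$ while the first column of deviations does not, so this sequence of partial sums converges in the mean but not uniformly.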

If series (39) uniformly converges to the sum $S(x)$ on an interval
$a \le x \le b$ we have

$$f_1(c) + f_2(c) + \dots + f_n(c) + \dots = S(c)$$

for each number $c$ belonging to the interval. Indeed, the difference
between the value of the $n$th partial sum of the series at the point $c$
and $S(c)$ does not exceed the maximal deviation of $S_n(x)$ from
$S(x)$ (why?) and therefore it tends to zero as $n \to \infty$. This property
makes it possible to obtain numerical series with known sums from
a functional series when its sum is known.

To test a series for uniform convergence we usually apply
Weierstrass' test whose condition is sufficient for the uniform
convergence: if all $f_n(x)$ $(a \le x \le b)$ satisfy the inequalities

$$|f_n(x)| \le a_n \quad (n = 1, 2, 3, \dots)$$

and

$$\sum_{n=1}^{\infty} a_n < \infty \tag{40}$$

series (39) converges uniformly. To prove the assertion we take
advantage of the comparison test (see Sec. 1) and conclude that
series (39) absolutely converges (as a numerical series) to a sum
$S(x)$ for each fixed value of $x$. At the same time

$$\max_{a \le x \le b} |S(x) - S_n(x)| = \max_{a \le x \le b} \Bigl|\sum_{k=n+1}^{\infty} f_k(x)\Bigr| \le \max_{a \le x \le b} \sum_{k=n+1}^{\infty} |f_k(x)| \le \sum_{k=n+1}^{\infty} a_k$$

Since the last sum is the remainder of a convergent series (see
Sec. III.6) it tends to zero as $n \to \infty$.

* Generally, we say that a series with partial sums $S_n(x)$ converges in the
mean of order $p$ to its sum $S(x)$ on an interval $a \le x \le b$ if

$$\lim_{n \to \infty} \Bigl( \int_a^b |S(x) - S_n(x)|^p\,dx \Bigr)^{1/p} = 0$$

Hence, in this English edition the term "convergence in the mean" is understood
as "convergence in the mean of order one" and the term "convergence in the
mean square" as "convergence in the mean of order two". When the term
"convergence in the mean" is used in mathematical books without qualification
it is sometimes understood as "convergence in the mean of order two" and
sometimes as "convergence in the mean of order one". (Tr.)
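The estimate used in the proof can be checked on a concrete series. The following sketch is an illustration chosen by us, not an example from the book: it assumes the terms $f_k(x) = \sin kx / k^2$ on $0 \le x \le 2\pi$ with the majorants $a_k = 1/k^2$, and it replaces the infinite remainder by a long finite one (up to the hypothetical cut-off `BIG`).

```python
import math

BIG = 1000  # assumed cut-off standing in for the infinite remainder


def partial_sum(x, n):
    """S_n(x) for the series with terms sin(k x) / k**2."""
    return sum(math.sin(k * x) / k**2 for k in range(1, n + 1))


def majorant_tail(n):
    """a_{n+1} + ... + a_BIG with a_k = 1 / k**2."""
    return sum(1.0 / k**2 for k in range(n + 1, BIG + 1))


def max_gap(n, samples=200):
    """Largest |S_BIG(x) - S_n(x)| over a grid on [0, 2*pi]."""
    grid = [2 * math.pi * i / samples for i in range(samples + 1)]
    return max(abs(partial_sum(x, BIG) - partial_sum(x, n)) for x in grid)


for n in (5, 20, 80):
    print(n, max_gap(n), majorant_tail(n))
```

For every $n$ the observed gap stays below the tail of the majorant series, and both columns decrease as $n$ grows, which is the behaviour the test guarantees.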

Condition (40) can also be written as

$$\sum_{n=1}^{\infty} \max_{a \le x \le b} |f_n(x)| < \infty$$

because the terms of the latter series can be taken as $a_n$. The tests
for convergence of series (39) in the mean or in the mean square
are similar to the above test. The conditions which are sufficient
for these types of convergence are put down, respectively, in the
forms

$$\sum_{n=1}^{\infty} \int_a^b |f_n(x)|\,dx < \infty \quad \text{and} \quad \sum_{n=1}^{\infty} \sqrt{\int_a^b [f_n(x)]^2\,dx} < \infty$$

but we shall not give the proof here.

There are cases when series (39) is divergent on the whole interval
$a \le x \le b$ but convergent on some subinterval $a_1 \le x \le b_1$ for
which $a \le a_1 < b_1 \le b$. The interval $a_1 \le x \le b_1$ is called the
domain of convergence of series (39).

In conclusion we note that an arbitrary variation of a finite num- 
ber of terms of series (39) does not affect its convergence or divergence 
(although this can change its sum). This property is analogous to 
the corresponding property of number series (see Sec. III. 6). 

9. Properties of Functional Series. 

1. The sum of a uniformly convergent series whose terms are 
continuous functions cannot have discontinuities. Indeed, if 

$$f_1(x) + f_2(x) + \dots + f_n(x) + \dots = S(x) \quad (a \le x \le b) \tag{41}$$

we have

$$S(x) = [f_1(x) + \dots + f_n(x)] + [f_{n+1}(x) + f_{n+2}(x) + \dots] = S_n(x) + R_n(x) \tag{42}$$

If the terms of the series are continuous functions, $S_n(x)$ is also
continuous as a sum of a finite number of continuous functions (see
Sec. III.14). On the other hand, if series (41) is uniformly convergent
its remainder $R_n(x)$ will be arbitrarily small over the whole
interval $a \le x \le b$ for sufficiently large values of $n$. Therefore
a small variation of $x$ yields small variations both of $S_n(x)$ and
$R_n(x)$, and thus the whole sum (42) gains a small increment as well,
which means that the sum cannot have discontinuities.

We sometimes consider series of form (41) on a finite or infinite
interval $a < x < b$ which uniformly converge not on the whole
interval but only on each proper subinterval $a_1 \le x \le b_1$ lying
entirely in the interior of the former interval. Then we can apply
the above property to the interval $a_1 \le x \le b_1$ and then, making
$a_1$ approach $a$ and $b_1$ approach $b$, conclude that the sum of the series
cannot have discontinuities in the original interval $a < x < b$.
Analogous conclusions are also true for the properties enumerated
below.

If the terms of a series of form (41) are discontinuous we can apply
the above argument and conclude that if series (41) converges
uniformly its sum can have discontinuities only at the points where the
terms are discontinuous. In contrast to this, if a series converges in
the mean its sum can have new discontinuities and, moreover, it
can be discontinuous even when all the summands are continuous
functions. This is connected with the fact that continuous functions
$S_n(x)$ can converge in the mean to a discontinuous function, as
is illustrated in Fig. 270.

2. A uniformly convergent series can be integrated termwise,
i.e. under this assumption (41) implies, for any $x_0$ and $x$ from the
interval $a \le x \le b$, that

$$\int_{x_0}^{x} f_1(t)\,dt + \int_{x_0}^{x} f_2(t)\,dt + \dots + \int_{x_0}^{x} f_n(t)\,dt + \dots = \int_{x_0}^{x} S(t)\,dt$$

where the series thus obtained is uniformly convergent on the interval
$a \le x \le b$. In fact, we have

$$\Bigl| \int_{x_0}^{x} S(t)\,dt - \sum_{k=1}^{n} \int_{x_0}^{x} f_k(t)\,dt \Bigr| = \Bigl| \int_{x_0}^{x} \Bigl[ S(t) - \sum_{k=1}^{n} f_k(t) \Bigr] dt \Bigr| = \Bigl| \int_{x_0}^{x} [S(t) - S_n(t)]\,dt \Bigr| \le$$

$$\le \Bigl| \int_{x_0}^{x} |S(t) - S_n(t)|\,dt \Bigr| \le \int_a^b |S(t) - S_n(t)|\,dt \le (b - a) \max_{a \le t \le b} |S(t) - S_n(t)| \to 0 \quad (n \to \infty)$$

In the case of convergence in the mean we can write the same
inequalities omitting the last one and thus prove that a series
convergent in the mean can be integrated term-by-term and that the
series obtained after the integration uniformly converges on the
interval $a \le x \le b$.

3. A series with continuous terms can be differentiated termwise
if this results in a uniformly convergent series, that is under these
assumptions (41) implies

$$f_1'(x) + f_2'(x) + \dots + f_n'(x) + \dots = S'(x)$$

To prove the property we denote the sum of the latter series by
$Q(x)$. Then integrating this series term-by-term (which is permissible
on the basis of property 2) we arrive at the equality

$$S(x) - S(x_0) = \int_{x_0}^{x} Q(t)\,dt$$

Finally, differentiating the last relation, we obtain $Q(x) = S'(x)$
which is what we set out to prove.

It is possible to specify the notion of convergence of a func- 
tional series with the help of generalized functions (see Sec. XIV.27) 
and then all the restrictions imposed on the character of the con- 
vergence may be dropped and we can integrate and differentiate 
termwise any convergent (in the generalized sense) series. 


§ 3. Power Series 

10. Interval of Convergence. A power series is written in the form

$$a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots \tag{43}$$

We have already encountered series of this type in our course (see
Sec. IV.16). Now we proceed to give the general theory of such
series. For the sake of simplicity let us suppose that there exists
a finite or infinite limit of the form

$$\lim_{n \to \infty} \frac{|a_n|}{|a_{n+1}|} = R \tag{44}$$

although the final results of our investigation will be valid for the
general case.

We can easily find out for what numerical values of $x$ series (43)
converges. Since

$$\lim_{n \to \infty} \frac{|a_{n+1} x^{n+1}|}{|a_n x^n|} = |x| \lim_{n \to \infty} \frac{|a_{n+1}|}{|a_n|} = \frac{|x|}{R} \tag{45}$$

we conclude, on the basis of D'Alembert's test (see Sec. 2), that
series (43) is absolutely convergent for $|x| < R$, i.e. for

$$-R < x < R \tag{46}$$

Interval (46) is referred to as the interval of convergence of power
series (43), and $R$ is called the radius of convergence. Series (43)
diverges for $|x| > R$, that is for $-\infty < x < -R$ and $R < x < \infty$.
Indeed, outside the interval of convergence limit (45) exceeds unity
which implies the divergence. Limit (45) is equal to 1 for $x = \pm R$,
that is D'Alembert's test is inapplicable to the end-points of the
interval of convergence. Examples show that depending on the
particular properties of a power series it can be convergent or
divergent at an end-point of its interval of convergence.
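Limit (44) is easy to explore by direct computation. The sketch below evaluates the ratio $|a_n| / |a_{n+1}|$ exactly for three coefficient sequences chosen by us for illustration ($a_n = 2^n$, $a_n = 1/(n+1)$ and $a_n = 1/n!$); the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def ratio(coeff, n):
    """The quantity |a_n| / |a_{n+1}| whose limit is R in formula (44)."""
    return abs(Fraction(coeff(n)) / Fraction(coeff(n + 1)))


def geometric(n):       # a_n = 2**n, ratio is constant, R = 1/2
    return Fraction(2) ** n


def harmonic_like(n):   # a_n = 1/(n + 1), ratio tends to R = 1
    return Fraction(1, n + 1)


def exponential(n):     # a_n = 1/n!, ratio grows without bound, R = infinity
    return Fraction(1, factorial(n))


for n in (10, 100, 1000):
    print(n, ratio(geometric, n), float(ratio(harmonic_like, n)), ratio(exponential, n))
```

A finite computation of this kind only suggests the value of the limit; it does not decide what happens at the end-points $x = \pm R$.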

Limit (44) does not exist in some cases but even then the interval
of convergence can sometimes be found by means of D'Alembert's
test. As an example, take the series

$$\frac{x^3}{2 \cdot 2^2} + \frac{x^6}{3 \cdot 2^4} + \frac{x^9}{4 \cdot 2^6} + \frac{x^{12}}{5 \cdot 2^8} + \dots$$

Limit (44) does not exist for the series (why?). The series converges
for the values of $x$ which yield

$$\lim_{n \to \infty} \Bigl[ \frac{|x|^{3(n+1)}}{(n+2)\, 2^{2(n+1)}} : \frac{|x|^{3n}}{(n+1)\, 2^{2n}} \Bigr] = \frac{|x|^3}{2^2} < 1$$

Hence, the series converges for $|x| < 2^{2/3} = \sqrt[3]{4}$, that is its interval
of convergence is $-\sqrt[3]{4} < x < \sqrt[3]{4}$. The series diverges at the
end-point $x = \sqrt[3]{4}$ of the interval of convergence and conditionally
converges at the end-point $x = -\sqrt[3]{4}$ (check up these assertions!).

If D’Alembert’s test is inapplicable we can nevertheless prove 
that the domain of convergence of a series of form (43) is an interval 
of type (46) but it is more difficult to find the value of R in this case. 

If $R = \infty$ series (43) converges for all $x$, i.e. over the whole x-axis,
although it can be inapplicable for practical calculations when the
values of $|x|$ become large (see the end of Sec. 4). The case $R = 0$
is also theoretically possible. But then series (43) converges only
at the single point $x = 0$ and therefore we shall not treat such series
here.

Let the reader verify that the radii of convergence for series 
(IV.55)-(IV.61) are equal, respectively, to oo, oo, oo, oo, oo, 1 and 1. 

We also consider power series of the form

$$a_0 + a_1 (x - a) + a_2 (x - a)^2 + \dots + a_n (x - a)^n + \dots \tag{47}$$

Denoting $x - a$ by $x_1$ we reduce the series to form (43) (with respect
to the new variable $x_1$). Hence, the series converges for

$$-R < x - a < R, \quad \text{i.e. for} \quad a - R < x < a + R$$

11. Properties of Power Series. 

1. A series of the form

$$a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots \tag{48}$$

uniformly converges (see Sec. 8) on each interval $-R_1 \le x \le R_1$
where $0 < R_1 < R$ and $R$ is the radius of convergence of series
(48). Indeed, we can write

$$|a_0| = |a_0|, \quad |a_1 x| \le |a_1 R_1|, \quad |a_2 x^2| \le |a_2 R_1^2|, \quad |a_3 x^3| \le |a_3 R_1^3|, \ \dots$$

for such an interval and hence the terms of series (48) do not exceed
the corresponding terms of the series

$$|a_0| + |a_1 R_1| + |a_2 R_1^2| + |a_3 R_1^3| + \dots$$

in their absolute values. The latter series converges since $R_1$ lies
inside the interval of convergence. Thus, by Weierstrass' test (see
Sec. 8), series (48) uniformly converges on the interval.

In the general case a power series may not uniformly converge 
on the entire interval of convergence. But Abel proved that if series 
(48) converges at an end-point of the interval of convergence, the 
interval on which the uniform convergence is guaranteed can be 
extended to this end-point. 

2. The sum of series (48) is continuous inside its interval of con- 
vergence. Indeed, this follows from property 1 in Sec. 9. Besides, 
Abel’s theorem mentioned above implies that if series (48) converges 
at an end-point of the interval of convergence its sum is continuous 
at the end-point. 

3. Term-by-term differentiation or integration of series (48)
does not change its radius of convergence. For instance, integrating
series (48) termwise we obtain the series

$$a_0 x + \frac{a_1}{2} x^2 + \frac{a_2}{3} x^3 + \dots + \frac{a_{n-1}}{n} x^n + \frac{a_n}{n+1} x^{n+1} + \dots$$

Let us compute its radius of convergence:

$$\lim_{n \to \infty} \frac{|a_{n-1}| / n}{|a_n| / (n+1)} = \lim_{n \to \infty} \frac{n+1}{n} \cdot \lim_{n \to \infty} \frac{|a_{n-1}|}{|a_n|} = 1 \cdot R = R$$

Thus we obtain the same value of the radius, see (44).

4. The relation

$$a_0 + a_1 x + \dots + a_n x^n + \dots = S(x) \quad (-R < x < R)$$

can be termwise integrated and differentiated any number of times.
This follows from properties 1 and 3 proved above and properties
2 and 3 in Sec. 9 because if the radius does not change when we
integrate or differentiate once it cannot change when the operations are
performed repeatedly.

In particular, properties 4 and 2 imply that the sum of a power
series possesses continuous derivatives of all the orders within its
interval of convergence.

For example, let us take the series

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + x^8 - \dots \quad (-1 < x < 1)$$

which can be obtained from series (IV.60) by putting $a = -1$ or
by applying formula (III.7) for the sum of an infinite geometric
progression (with common ratio less than unity in its modulus).
Integrating termwise we obtain

$$\int_0^x \frac{dx}{1 + x^2} = \arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots \quad (-1 < x < 1) \tag{49}$$

The series on the right-hand side being convergent for $x = 1$ as
well, Abel's theorem implies the validity of formula (49) for $x = 1$.
Hence we have managed to find the sum of an interesting series of
the form

$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots = \arctan 1 = \frac{\pi}{4}$$
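The last equality can be checked numerically, although the series converges slowly. In the sketch below (an illustration of ours; the name `leibniz_partial_sum` is hypothetical) the error of the partial sum with $n$ terms is compared with the first omitted term $1/(2n+1)$, which bounds the remainder of an alternating series with decreasing terms.

```python
import math


def leibniz_partial_sum(n_terms):
    """Sum of the first n_terms of 1 - 1/3 + 1/5 - 1/7 + ..."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n_terms))


for n in (10, 100, 1000, 10000):
    error = abs(leibniz_partial_sum(n) - math.pi / 4)
    print(n, error, 1 / (2 * n + 1))
```

The slow decrease of the error shows why this series, interesting as it is, is of little use for computing $\pi$ in practice.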

Performing termwise integrations and differentiations of a given
series we can sometimes reduce the series to a series whose sum is
known and thus find the sum of the series in question. As an example,
let us find the sum of the series

$$2 + \frac{3}{1!} x + \frac{4}{2!} x^2 + \frac{5}{3!} x^3 + \dots = S(x)$$

By D'Alembert's test, we readily conclude that the series converges
throughout the whole x-axis, that is $R = \infty$. Let us multiply both
sides by $x$ and integrate the result from zero to some $x$:

$$\int_0^x x S(x)\,dx = x^2 + \frac{x^3}{1!} + \frac{x^4}{2!} + \frac{x^5}{3!} + \dots = x^2 \Bigl( 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots \Bigr) = x^2 e^x$$

We obtain, by differentiating,

$$x S(x) = (x^2 e^x)' = 2x e^x + x^2 e^x$$

i.e. finally we have

$$S(x) = (2 + x) e^x$$
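A result of this kind is easily verified by comparing a long partial sum with the closed expression. The sketch below assumes that the general term of the series is $(n+2)x^n/n!$, which is what the first terms and the derivation suggest; the function names and the number of retained terms are our own choices.

```python
import math


def series_sum(x, terms=60):
    """Partial sum of the series with general term (n + 2) x**n / n!."""
    return sum((n + 2) * x**n / math.factorial(n) for n in range(terms))


def closed_form(x):
    """The sum S(x) = (2 + x) e**x obtained by termwise integration."""
    return (2 + x) * math.exp(x)


for x in (-2.0, 0.0, 0.5, 3.0):
    print(x, series_sum(x), closed_form(x))
```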
Let us consider an example of a different kind. Let it be necessary
to find the sum of the series

$$\frac{x^2}{1 \cdot 2} + \frac{x^3}{2 \cdot 3} + \frac{x^4}{3 \cdot 4} + \frac{x^5}{4 \cdot 5} + \dots = \sigma(x) \tag{50}$$

For this purpose we differentiate it termwise:

$$\sigma'(x) = \frac{x}{1} + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \dots = -\ln(1 - x)$$

[see formula (IV.61)]. It follows that

$$\sigma(x) = -\int \ln(1 - x)\,dx = -x \ln(1 - x) - \int \frac{x\,dx}{1 - x} = x + (1 - x) \ln(1 - x) + C \tag{51}$$

To determine the value of $C$ we put $x = 0$ in formulas (50) and (51).
This yields $0 = \sigma(0) = C$. Finally,

$$\sigma(x) = x + (1 - x) \ln(1 - x)$$

In other examples of this type we can encounter integrals which 
are inexpressible in terms of elementary functions. 

The sum of a functional series can sometimes be found by forming
a differential equation which is satisfied by the sum and solving it.
For instance, let us find the sum of the series

$$\frac{x}{1!} + \frac{x^4}{4!} + \frac{x^7}{7!} + \frac{x^{10}}{10!} + \frac{x^{13}}{13!} + \dots = p(x) \tag{52}$$

To do this we differentiate formula (52) three times:

$$1 + \frac{x^3}{3!} + \frac{x^6}{6!} + \frac{x^9}{9!} + \frac{x^{12}}{12!} + \dots = p'(x) \tag{53}$$

$$\frac{x^2}{2!} + \frac{x^5}{5!} + \frac{x^8}{8!} + \frac{x^{11}}{11!} + \dots = p''(x) \tag{54}$$

$$\frac{x}{1!} + \frac{x^4}{4!} + \frac{x^7}{7!} + \frac{x^{10}}{10!} + \dots = p'''(x)$$

We see that we have arrived at the original series, i.e.

$$p'''(x) - p(x) = 0$$

Applying the methods of Sec. XV.17 and solving the equation we
find:

$$p(x) = C_1 e^x + e^{-x/2} \Bigl( C_2 \cos \frac{\sqrt{3}}{2} x + C_3 \sin \frac{\sqrt{3}}{2} x \Bigr) \tag{55}$$

To compute $C_1$, $C_2$ and $C_3$ we substitute $x = 0$ into formulas (52),
(53) and (54) which results in $p(0) = 0$, $p'(0) = 1$ and $p''(0) = 0$.
These values determine the initial conditions for $p(x)$. By formula
(55), we obtain

$$C_1 + C_2 = 0, \quad C_1 - \frac{1}{2} C_2 + \frac{\sqrt{3}}{2} C_3 = 1, \quad C_1 - \frac{1}{2} C_2 - \frac{\sqrt{3}}{2} C_3 = 0$$

which implies

$$C_1 = \frac{1}{3}, \quad C_2 = -\frac{1}{3}, \quad C_3 = \frac{1}{\sqrt{3}}$$

Finally, the sum of series (52) is expressed as

$$\frac{x}{1!} + \frac{x^4}{4!} + \frac{x^7}{7!} + \frac{x^{10}}{10!} + \dots = \frac{1}{3} e^x + e^{-x/2} \Bigl( -\frac{1}{3} \cos \frac{\sqrt{3}}{2} x + \frac{1}{\sqrt{3}} \sin \frac{\sqrt{3}}{2} x \Bigr)$$
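Since the printed formulas of this example are hard to read, a numerical check is useful. The sketch below assumes the constants $C_1 = 1/3$, $C_2 = -1/3$, $C_3 = 1/\sqrt{3}$ that follow from the initial conditions $p(0) = 0$, $p'(0) = 1$, $p''(0) = 0$, and compares the resulting expression with a partial sum of series (52); the function names are hypothetical.

```python
import math


def series_sum(x, terms=40):
    """Partial sum of x/1! + x**4/4! + x**7/7! + ... (series (52))."""
    return sum(x ** (3 * k + 1) / math.factorial(3 * k + 1) for k in range(terms))


def closed_form(x):
    """Solution of p''' - p = 0 with p(0) = 0, p'(0) = 1, p''(0) = 0."""
    w = math.sqrt(3) / 2
    oscillating = -math.cos(w * x) / 3 + math.sin(w * x) / math.sqrt(3)
    return math.exp(x) / 3 + math.exp(-x / 2) * oscillating


for x in (-2.0, 0.0, 1.0, 2.0):
    print(x, series_sum(x), closed_form(x))
```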


In other cases we can sometimes similarly reduce the sum of a
given number series to an integral or even to a simple combination
of mathematical constants (i.e. to integers, the numbers $\pi$ and $e$ etc.)
and functions of the constants. As an example, let us take the sum

$$\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots = S \tag{56}$$

To compute it, we introduce an auxiliary series of the form

$$\frac{x}{1^2} + \frac{x^2}{2^2} + \frac{x^3}{3^2} + \dots = q(x) \quad (-1 < x < 1)$$

Differentiating the series we derive

$$q(x) = -\int_0^x \frac{\ln(1 - x)}{x}\,dx$$

(check it up!). Substituting $x = 1$ we obtain the sum of series (56):

$$S = -\int_0^1 \frac{\ln(1 - x)}{x}\,dx \tag{57}$$

The corresponding indefinite integral is not an elementary function
but nevertheless it is sometimes preferable to express the sum
of a series in the form of a definite integral. By the way, in Sec. 25
we shall give another method of computing the sum of series (56)
which will show that the sum equals $\pi^2/6$. From this, in particular,
we obtain the numerical value of integral (57).

12. Algebraic Operations on Power Series. The power series being 
absolutely convergent in the interior of their intervals of conver- 
gence, we can termwise add them together, multiply by a common 
factor (see Sec. 3) and multiply them by one another following the 
rules of multiplication of polynomials (see Sec. 6). 
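For example, let us consider the multiplication of the power
series expressing the functions $e^x$ and $\ln(1 + x)$:

$$e^x \ln(1 + x) = \Bigl( 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots \Bigr) \Bigl( x - \frac{x^2}{2} + \frac{x^3}{3} - \dots \Bigr) =$$

$$= x + \Bigl( \frac{1}{1!} - \frac{1}{2} \Bigr) x^2 + \Bigl( \frac{1}{2!} - \frac{1}{2 \cdot 1!} + \frac{1}{3} \Bigr) x^3 + \Bigl( \frac{1}{3!} - \frac{1}{2 \cdot 2!} + \frac{1}{3 \cdot 1!} - \frac{1}{4} \Bigr) x^4 +$$

$$+ \Bigl( \frac{1}{4!} - \frac{1}{2 \cdot 3!} + \frac{1}{3 \cdot 2!} - \frac{1}{4 \cdot 1!} + \frac{1}{5} \Bigr) x^5 + \dots = x + \frac{1}{2} x^2 + \frac{1}{3} x^3 + 0 \cdot x^4 + \frac{3}{40} x^5 + \dots$$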

Obviously, we can compute as many coefficients entering into the 
product as needed. The radius of convergence of the first series is 
equal to oo whereas that of the second series is equal to 1. Hence, 
the above result is valid for the interval — 1 < x < 1 in which 
both series are absolutely convergent. 
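The rule of multiplication of polynomials applied to truncated series is simple to mechanize. The sketch below is a minimal illustration of ours with exact rational arithmetic; the helper `multiply` and the truncation order `N` are hypothetical names, and only the coefficients up to $x^5$ are kept.

```python
from fractions import Fraction
from math import factorial

N = 6  # keep the coefficients of x**0 .. x**5


def multiply(a, b):
    """Coefficients of the product of two series, truncated after x**(N-1)."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(N)]


# e**x = 1 + x/1! + x**2/2! + ...
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N)]
# ln(1 + x) = x - x**2/2 + x**3/3 - ...
log_coeffs = [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, N)]

product = multiply(exp_coeffs, log_coeffs)
print(product)  # coefficients of x**0, x**1, ..., x**5 as exact fractions
```

Working with fractions rather than floating-point numbers makes it possible to compare the computed coefficients with hand calculations exactly.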

The division of series is performed in a similar way. For instance,
consider the ratio of the power series for $\sin x$ and $\cos x$.

To perform the division we take these series arranged in ascending
powers of the variable and divide them as if they were polynomials:

$$\frac{\sin x}{\cos x} = \frac{x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \dfrac{x^7}{7!} + \dots}{1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!} - \dfrac{x^6}{6!} + \dots}$$

Thus, the expansion of the tangent into a power series is of the
following form:

$$\tan x = \frac{\sin x}{\cos x} = x + \frac{1}{3} x^3 + \frac{2}{15} x^5 + \frac{17}{315} x^7 + \dots \tag{58}$$

To compute the subsequent terms in (58) we must put down the
subsequent terms in the expansions of $\sin x$ and $\cos x$ and continue
the division. It can be proved that formula (58) is valid for $|x| < \pi/2$.

Expansion (58) can also be obtained by means of the method
of undetermined coefficients. To apply the method we note that
$\tan x$ is an odd function and therefore its expansion must contain
only the odd powers of $x$:

$$\tan x = a_1 x + a_3 x^3 + a_5 x^5 + a_7 x^7 + \dots$$

But $\cos x \cdot \tan x = \sin x$, and therefore

$$\Bigl( 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots \Bigr) (a_1 x + a_3 x^3 + a_5 x^5 + a_7 x^7 + \dots) = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$$

Removing the brackets and equating the coefficients of like powers
of $x$ we obtain:

$$a_1 = \frac{1}{1!}, \quad a_3 - \frac{a_1}{2!} = -\frac{1}{3!}, \quad a_5 - \frac{a_3}{2!} + \frac{a_1}{4!} = \frac{1}{5!}, \quad a_7 - \frac{a_5}{2!} + \frac{a_3}{4!} - \frac{a_1}{6!} = -\frac{1}{7!}, \ \dots$$

Hence, the coefficients $a_1, a_3, a_5, \dots$ can be found in succession
from the above relations.
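The successive determination of the coefficients is the same computation as the division of the series. The sketch below (our illustration; `divide` and `N` are hypothetical names) solves the relations one after another in exact arithmetic, keeping the powers up to $x^7$.

```python
from fractions import Fraction
from math import factorial

N = 8  # keep the coefficients of x**0 .. x**7


def divide(num, den):
    """Coefficients q of num/den found from den * q = num; den[0] must be nonzero."""
    q = []
    for k in range(len(num)):
        # coefficient of x**k in den * q, without the term den[0] * q[k]
        known = sum(den[i] * q[k - i] for i in range(1, k + 1))
        q.append((num[k] - known) / den[0])
    return q


sin_coeffs = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 1 else Fraction(0)
              for k in range(N)]
cos_coeffs = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 0 else Fraction(0)
              for k in range(N)]

tan_coeffs = divide(sin_coeffs, cos_coeffs)
print(tan_coeffs)  # the odd-indexed entries are a_1, a_3, a_5, a_7
```

The even-indexed coefficients come out equal to zero automatically, in agreement with the remark that $\tan x$ is an odd function.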

Another method of manipulating series is the substitution of
a series into a series. For instance, we can substitute a series of the
form

$$y = f(x) = a_0 + a_1 x + a_2 x^2 + \dots$$

into the series

$$g(y) = b_0 + b_1 y + b_2 y^2 + \dots \tag{59}$$

or, in a more general case, into the series

$$\psi(y) = c_0 + c_1 (y - a) + c_2 (y - a)^2 + \dots$$

For instance, taking series (59) we obtain the expression

$$g(f(x)) = b_0 + b_1 (a_0 + a_1 x + a_2 x^2 + \dots) + b_2 (a_0 + a_1 x + a_2 x^2 + \dots)^2 + \dots$$

in which we must remove the brackets and combine similar terms.
For the result to be valid for $x = 0$ it is necessary that series (59)
converge for $y = a_0$ (why is it so?). This means that $a_0$ must lie
within the interval of convergence of series (59). By the way, if the
condition is not fulfilled the calculations themselves will indicate
the mistake.

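Take an example:

$$\ln(1 + \sin x) = \frac{\sin x}{1} - \frac{(\sin x)^2}{2} + \frac{(\sin x)^3}{3} - \dots =$$

$$= \Bigl( x - \frac{x^3}{6} + \frac{x^5}{120} - \frac{x^7}{5040} + \dots \Bigr) - \frac{1}{2} \Bigl( x - \frac{x^3}{6} + \frac{x^5}{120} - \dots \Bigr)^2 + \frac{1}{3} \Bigl( x - \frac{x^3}{6} + \frac{x^5}{120} - \dots \Bigr)^3 - \dots =$$

$$= x - \frac{x^3}{6} + \frac{x^5}{120} - \frac{x^7}{5040} + \dots - \frac{1}{2} \Bigl( x^2 - \frac{1}{3} x^4 + \frac{2}{45} x^6 - \dots \Bigr) + \frac{1}{3} \Bigl( x^3 - \frac{1}{2} x^5 + \frac{13}{120} x^7 - \dots \Bigr) -$$

$$- \frac{1}{4} \Bigl( x^4 - \frac{2}{3} x^6 + \dots \Bigr) + \frac{1}{5} \Bigl( x^5 - \frac{5}{6} x^7 + \dots \Bigr) - \frac{1}{6} (x^6 + \dots) + \frac{1}{7} (x^7 + \dots) - \dots$$

When calculating in succession the powers entering into the resulting
series we have performed the multiplications for the Taylor series
of $\sin x$ according to the rule of multiplication of polynomials and
dropped the powers of $x$ higher than the power corresponding to the
desired accuracy of calculations (higher than $x^7$). Now, combining
similar terms we finally derive

$$\ln(1 + \sin x) = x - \frac{x^2}{2} + \frac{x^3}{6} - \frac{x^4}{12} + \frac{x^5}{24} - \frac{x^6}{45} + \frac{61 x^7}{5040} + \dots$$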
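The substitution of a series into a series can be carried out mechanically in the same truncated manner. The sketch below is an illustration of ours (the names `multiply` and `N` are hypothetical): it accumulates the terms $(-1)^{m+1} (\sin x)^m / m$ of the logarithmic series, dropping all powers higher than $x^7$ at every multiplication.

```python
from fractions import Fraction
from math import factorial

N = 8  # keep the coefficients of x**0 .. x**7


def multiply(a, b):
    """Product of two series, truncated after x**(N-1)."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(N)]


sin_coeffs = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 1 else Fraction(0)
              for k in range(N)]

result = [Fraction(0)] * N
power = [Fraction(1)] + [Fraction(0)] * (N - 1)  # (sin x)**0
for m in range(1, N):
    power = multiply(power, sin_coeffs)          # (sin x)**m, truncated
    coeff = Fraction((-1) ** (m + 1), m)         # coefficient of y**m in ln(1 + y)
    result = [r + coeff * p for r, p in zip(result, power)]

print(result)  # coefficients of ln(1 + sin x) up to x**7
```

Because the series of $\sin x$ has no constant term, the power $(\sin x)^m$ begins with $x^m$, so the powers with $m > 7$ cannot influence the retained coefficients.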


The methods discussed in Secs. 11, 12 make it possible to obtain 
the expansions of many other functions by taking advantage of the 
simplest series given in Sec. IV.16. It is sometimes difficult to write 
down the explicit expression of the general term of such an expan- 
sion but at the same time we can always find any number of terms 
which is usually sufficient for practical purposes. 
13. Power Series as a Taylor Series. We now consider the sum of
a series of the form

$$f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n + \dots \quad (-R < x < R) \tag{60}$$

The coefficients of the series can be easily expressed in terms of
its sum. For this purpose we perform successive differentiations of
formula (60) and substitute $x = 0$ into the results, as it was done
in Sec. IV.15. This yields

$$f(0) = a_0;$$

$$f'(x) = 1 \cdot a_1 + 2 a_2 x + 3 a_3 x^2 + \dots, \quad f'(0) = 1 \cdot a_1;$$

$$f''(x) = 1 \cdot 2 a_2 + 2 \cdot 3 a_3 x + 3 \cdot 4 a_4 x^2 + \dots, \quad f''(0) = 1 \cdot 2 a_2;$$

$$f'''(x) = 1 \cdot 2 \cdot 3 a_3 + 2 \cdot 3 \cdot 4 a_4 x + 3 \cdot 4 \cdot 5 a_5 x^2 + \dots, \quad f'''(0) = 1 \cdot 2 \cdot 3 a_3 \quad \text{etc.}$$

Finding $a_0, a_1, a_2, \dots$ from these relations and substituting
them into (60) we obtain the expression

$$f(x) = f(0) + \frac{f'(0)}{1!} x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \dots + \frac{f^{(n)}(0)}{n!} x^n + \dots \quad (-R < x < R) \tag{61}$$

which is nothing but Taylor's series (IV.54) we have already dealt
with. Thus, a power series is Taylor's series of its sum.

The coefficients of series (61) being uniquely expressed in terms
of its sum, we can assert, in particular, that if the sums of two power
series are identically equal their coefficients in like powers of $x$
also coincide. Accordingly, if the sum of a power series is identically
equal to zero all its coefficients are also equal to zero.

In the above argument we have regarded a power series as being
given. But in practice we usually deal with the reverse problem of
expanding a given function $f(x)$. Then, naturally, we encounter
the problem of determining the range of the values of $x$ for which
formula (61) is valid. On the basis of Secs. IV.15, 16, we conclude
that this is equivalent to the question what are the values of $x$ for
which the remainder of the corresponding finite Taylor formula
tends to zero when the number $n$ increases.

An exhaustive investigation of the remainder can be carried out
only in some simple cases. Fortunately, we can easily do without
such an investigation when we deal with elementary functions
because it is possible to prove that formula (61) is valid for an
elementary function $f(x)$ on every interval in which the series
converges provided that the direct substitution of $x = 0$ into the
expressions $f(x), f'(x), f''(x), \dots$ yields the finite results $f(0), f'(0),
f''(0), \dots$. This implies, in particular, that the expansions given
in Sec. IV.16 are valid for the intervals of convergence of the
corresponding series.

It should be remarked that there are many functions that cannot 
be expanded into power series (i.e. into Taylor’s series). For instance, 
a function which is discontinuous in an interval or has a derivative 
of the first or of a higher order with a discontinuity on the interval 
cannot be represented by a power series. The same is true for a func- 
tion which is represented by means of different formulas on different 
parts of the interval under consideration. 

All that has been said here is directly extended to series in powers 
of x — a of form (47) and to corresponding Taylor’s series (IV.53). 

14. Power Series with Complex Terms. These series are of the form

$$a_0 + a_1 z + a_2 z^2 + \dots + a_n z^n + \dots \tag{62}$$

where the coefficients $a_n$ and the independent variable $z$ can assume
any complex values. The theory of these series is analogous to that
of real power series but the inequality $|z| < R$ defining the values
of $z$ for which series (62) is convergent represents not an interval
but a circle in the complex z-plane. The circle is referred to as the
circle of convergence of series (62) (see Fig. 338). Similarly, for
a series in powers of $z - a$ where $a$ is a complex number the
domain of convergence is specified by an inequality of the form
$|z - a| < R$ which defines a circle of radius $R$ with centre at the
point $a$. If $R = \infty$ the series converges throughout the whole
complex plane. The properties enumerated in Secs. 11 and 12 can
be transferred to the series of form (62) without essential changes.
The sum $S(z)$ of such a series is a complex function of a complex
variable (see Sec. VIII.11). For these functions the notion of an
integral is introduced by means of the notion of an antiderivative.
In Sec. VIII.4 we considered several examples of defining a function
for complex values of its argument by means of series of
form (62).

Fig. 338

As it was mentioned in Sec. VIII.4, the identities which are satisfied
by functions for real values of the argument remain valid for
its complex values when we continue the functions in the complex
plane by means of the corresponding power series. To illustrate
what has been said we take the equality

$$e^{\ln(1 + x)} = 1 + x \tag{63}$$

It is valid for real $x$, by the definition of the logarithm. Hence, if
we substitute the power series of $y = \ln(1 + x)$ (see Sec. 12) into
the series of $e^y$ and perform the corresponding identical
transformations (removing the brackets etc.) we must obtain $1 + x$.
Performing the same operations for the complex values of the argument,
i.e. writing $z$ in place of $x$, we obtain

$$e^{\ln(1 + z)} = 1 + z \tag{64}$$

which is what we set out to prove.

This formula implies, in particular, that the definition of the
logarithm by means of power series (IV.61) into which $z$ is substituted
for $x$ is coherent with the definition of the logarithm of a complex
number given in Sec. VIII.5. The sum of the series represents
only one branch of the infinite-valued function $\ln(1 + z)$, namely
the branch which yields zero when $z = 0$ is substituted into the
function.

Formula (VIII.7) and many other formulas can be proved in
a similar way.

15. Bernoullian Numbers. The so-called Bernoullian numbers
discovered by Jacob Bernoulli are widely applied in the theory
of series and particularly in the theory of power series. These numbers,
which we denote by $\beta_1, \beta_2, \beta_3, \beta_4, \dots$, are defined by a
symbolic recurrence relation of the form

$$(\beta + 1)^{n+1} - \beta^{n+1} = 0 \quad (n = 1, 2, 3, \dots)$$

in which $\beta^k$ should be replaced by $\beta_k$ after the brackets have been
removed on the left-hand side.

For instance, putting $n = 1$, $n = 2$, $n = 3$, in succession, we
obtain, respectively,

$$\beta_2 + 2\beta_1 + 1 - \beta_2 = 0$$

which yields $\beta_1 = -\frac{1}{2}$,

$$\beta_3 + 3\beta_2 + 3\beta_1 + 1 - \beta_3 = 0$$

which yields $\beta_2 = -\frac{3\beta_1 + 1}{3} = \frac{1}{6}$, and

$$\beta_4 + 4\beta_3 + 6\beta_2 + 4\beta_1 + 1 - \beta_4 = 0$$

which yields $\beta_3 = -\frac{6\beta_2 + 4\beta_1 + 1}{4} = 0$.

The subsequent calculations result in

$$\beta_4 = -\frac{1}{30}, \quad \beta_5 = 0, \quad \beta_6 = \frac{1}{42}, \quad \beta_7 = 0, \quad \beta_8 = -\frac{1}{30}, \quad \beta_9 = 0,$$

$$\beta_{10} = \frac{5}{66}, \quad \beta_{11} = 0, \quad \beta_{12} = -\frac{691}{2730}, \quad \beta_{13} = 0, \quad \beta_{14} = \frac{7}{6}, \ \dots$$

It is possible to prove that all $\beta_n$ with odd numbers $n \ge 3$ are
equal to zero. Introducing the notation

$$B_n = (-1)^{n-1} \beta_{2n} \quad (n = 1, 2, 3, \dots)$$

we write

$$B_1 = \frac{1}{6}, \quad B_2 = \frac{1}{30}, \quad B_3 = \frac{1}{42}, \quad B_4 = \frac{1}{30}, \quad B_5 = \frac{5}{66}, \quad B_6 = \frac{691}{2730}, \ \dots$$

These numbers are also referred to as Bernoullian numbers.

We now put down some formulas involving Bernoullian numbers.
For the function $\zeta(x)$ (see Sec. 4) we have

$$\zeta(2k) = \sum_{n=1}^{\infty} \frac{1}{n^{2k}} = \frac{B_k (2\pi)^{2k}}{2 \cdot (2k)!} \quad (k = 1, 2, \dots) \tag{65}$$

In particular, this yields

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{B_1 (2\pi)^2}{2 \cdot 2!} = \frac{\pi^2}{6}, \quad \sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{B_2 (2\pi)^4}{2 \cdot 4!} = \frac{\pi^4}{90}$$

Formula (65) shows that all the numbers $B_n$ are positive. Further,
we have

$$\tan x = \sum_{n=1}^{\infty} \frac{2^{2n} (2^{2n} - 1)}{(2n)!} B_n x^{2n-1}$$

[compare with formula (58)] etc.
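The symbolic recurrence relation lends itself to direct computation. After the brackets are removed it reads $\sum_{k=0}^{n} \binom{n+1}{k} \beta_k = 0$, where we assume that $\beta^0$ is to be read as the number 1. The sketch below (our illustration; the function name is hypothetical) computes the numbers exactly and then forms the first coefficients of the series for $\tan x$.

```python
from fractions import Fraction
from math import comb, factorial


def beta_numbers(count):
    """beta_0 .. beta_count from (beta + 1)**(n + 1) - beta**(n + 1) = 0."""
    beta = [Fraction(1)]  # beta**0 is taken to be the plain number 1
    for n in range(1, count + 1):
        known = sum(comb(n + 1, k) * beta[k] for k in range(n))
        beta.append(-known / (n + 1))  # solve for beta_n
    return beta


beta = beta_numbers(14)
B = {n: (-1) ** (n - 1) * beta[2 * n] for n in range(1, 8)}

# coefficients of x, x**3, x**5, x**7 in the series for tan x
tan_coeffs = [Fraction(2 ** (2 * n) * (2 ** (2 * n) - 1), factorial(2 * n)) * B[n]
              for n in range(1, 5)]

print(beta[1:])
print(tan_coeffs)
```

The coefficients obtained in this way can be compared with those of expansion (58).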

16. Applying Series to Solving Difference Equations. A difference
equation connects an unknown quantity and its finite differences
(see Sec. V.7). We first turn to the case when the unknown quantity
is represented by a sequence $a_0, a_1, a_2, \dots, a_n, \dots$. A difference
equation defining the sequence can be put down in the general form

$$f(n, a_n, \Delta a_n, \Delta^2 a_n) = 0 \quad (n = 0, 1, 2, \dots) \tag{66}$$

if we restrict ourselves to equations of the second order. Here
$\Delta a_n = a_{n+1} - a_n$ and $\Delta^2 a_n = \Delta a_{n+1} - \Delta a_n$. Substituting

$$\Delta a_n = a_{n+1} - a_n \quad \text{and} \quad \Delta^2 a_n = a_{n+2} - 2a_{n+1} + a_n$$

we reduce (66) to the form

$$\varphi(n, a_n, a_{n+1}, a_{n+2}) = 0 \quad (n = 0, 1, 2, \dots) \tag{67}$$

In a particular case the left-hand sides of equations (66) and (67)
may not contain all the arguments put down there.

To solve equation (67) we can arbitrarily set two values of the
unknown quantity, for instance, $a_0$ and $a_1$. Then we find $a_2$ by
putting $n = 0$ in (67). Further, putting $n = 1$ in (67) and substituting
the value $a_2$ found above we determine $a_3$ and so on. This
step-by-step procedure enables us to find any desired number of
terms of the sequence $a_n$.
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A particular case of equation (G7) is a homogeneous linear equation 
with constant coefficients which is written as 

c«7 n + P<z„ +1 +v«n+2 = 0 ( n = °> 2 > P’ y = const) (68) 

The solution of equation (68) can he found in the general form 
hy means of a method which is illustrated below. Equations of any 
order can be solved in a similar way. 

Let us form the so-called generating function of the sough t-for 
sequence whose expansion into a power series of the form 

Q = a 0 + «ix -f- a z x 2 a n x n -f- . . . 


gives rise to the terms of the sequence as the coefficients in the 
series. It is only the coefficients of the series that we are interested 
in here and therefore we aro not going to consider any numerical 
values of x and the question on the convergence of the series. In 
such a case a power series is referred to as a formal power series. 
We can easily find ifie product 

(Y + -f ax 2 ) Q = ya 0 + $a 0 -f ya t ) x + (aa 0 + -f ya 2 ) x 2 + 

-f (a«i + $a 2 + y a 3 ) x 3 -f- . . . 


Equation (68) implies that all the coefficients on the right-hand 
side, from that in x 2 onwards, are equal to zero. Performing the 
division we derive 


i _ Tap +(P«o + yaps 


(69) 


y-j-fiz-f-ax 2 

The values of $a_0$ and $a_1$ being given, we have the ratio of two polynomials with the given coefficients on the right-hand side. It can be decomposed into partial fractions of the form $\dfrac{A}{(x - a)^q}$ by applying the methods discussed in Sec. VIII.10. Each fraction can be rewritten as

$$\frac{A}{(x - a)^q} = \frac{A}{(-a)^q \left(1 - \frac{x}{a}\right)^q} = \frac{B}{(1 - \gamma x)^q}$$

where $B = \dfrac{A}{(-a)^q}$ and $\gamma = \dfrac{1}{a}$ (here $\gamma$ is a new constant and not the coefficient of equation (68)). We have $q = 1$ or $q = 2$ for fraction (69), but for difference equations of higher order the values of $q$ may be greater. These fractions are expanded into power series according to the formulas which are obtained from the formula of the sum of a geometric series by means of differentiation:

$$\frac{B}{1 - \gamma x} = B + B\gamma x + B\gamma^2 x^2 + \ldots + B\gamma^n x^n + \ldots$$

$$\frac{B}{(1 - \gamma x)^2} = \frac{B}{\gamma} \left(\frac{1}{1 - \gamma x}\right)' = B + 2B\gamma x + 3B\gamma^2 x^2 + \ldots + (n + 1) B\gamma^n x^n + \ldots$$

and so on. Adding together the coefficients in $x^n$ entering into all the above series we thus determine the coefficient $a_n$ in the series $Q = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n + \ldots$, and hence equation (68) has been solved in the general form.

We suggest that the reader apply the above method to deriving the general formula of the so-called Fibonacci numbers $a_0 = 0$, $a_1 = 1$, $a_2 = 1$, $a_3 = 2$, $a_4 = 3$, $a_5 = 5$, ..., each of which, from the third onwards, is equal to the sum of the two preceding numbers. The sought-for expression is of the form

$$a_n = \frac{(1 + \sqrt{5})^n - (1 - \sqrt{5})^n}{\sqrt{5} \cdot 2^n}$$
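The generating-function method can be checked numerically. The following Python sketch is an illustration added for this purpose; the function names `series_quotient`, `solve_recurrence` and `fibonacci_closed_form` are assumptions of the example and not notation of the text. It expands quotient (69) into a formal power series by long division in exact rational arithmetic and compares the coefficients for the Fibonacci case ($\alpha = 1$, $\beta = 1$, $\gamma = -1$, $a_0 = 0$, $a_1 = 1$) with the closed formula.

```python
from fractions import Fraction
from math import sqrt


def series_quotient(num, den, count):
    """Return the first `count` coefficients of the formal series num(x)/den(x).

    `num` and `den` list polynomial coefficients in increasing powers of x;
    den[0] must be non-zero.
    """
    num = [Fraction(c) for c in num] + [Fraction(0)] * count
    den = [Fraction(c) for c in den]
    quotient = []
    for n in range(count):
        # The coefficient of x^n in quotient * den must equal num[n].
        known = sum(quotient[n - j] * den[j]
                    for j in range(1, min(n, len(den) - 1) + 1))
        quotient.append((num[n] - known) / den[0])
    return quotient


def solve_recurrence(alpha, beta, gamma, a0, a1, count):
    """Coefficients a_0, a_1, ... of Q from formula (69) for equation (68)."""
    numerator = [gamma * a0, beta * a0 + gamma * a1]
    denominator = [gamma, beta, alpha]
    return series_quotient(numerator, denominator, count)


def fibonacci_closed_form(n):
    """Value of ((1 + sqrt 5)^n - (1 - sqrt 5)^n) / (sqrt 5 * 2^n)."""
    root = sqrt(5.0)
    return ((1 + root) ** n - (1 - root) ** n) / (root * 2 ** n)


if __name__ == "__main__":
    # a_{n+2} = a_{n+1} + a_n, i.e. a_n + a_{n+1} - a_{n+2} = 0
    terms = solve_recurrence(1, 1, -1, 0, 1, 12)
    for n, a_n in enumerate(terms):
        print(n, a_n, round(fibonacci_closed_form(n)))
```

The long division plays the role of the partial-fraction expansion here: both produce the coefficients of the same formal series, so the two columns printed by the script should agree.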

We also consider difference equations in which the unknown quantity is a sought-for function $y(x)$. In this case instead of (66) and (67) we have equations of the forms

$$f(x, y, \Delta_h y, \Delta_h^2 y) = 0 \quad \text{and} \quad \varphi(x, y(x), y(x + h), y(x + 2h)) = 0 \tag{70}$$

respectively.

The latter case can be reduced to the former in which the unknown quantity is represented by a sequence. For example, let $0 \le x < \infty$. We introduce the notation $a_n = y(\xi + nh)$ $(n = 0, 1, 2, \ldots)$ where $\xi$ is a constant number lying in the interval $0 \le \xi < h$. Putting $x = \xi + nh$ in equation (70) we can rewrite it as

$$\varphi(\xi + nh, a_n, a_{n+1}, a_{n+2}) = 0$$

and hence, for a constant $\xi$, we obtain an equation of form (67). After $a_n$ has been found we can use the arbitrariness of the choice of $\xi$ and thus obtain the sought-for solution $y(x)$. In particular, it follows that we can arbitrarily set the values of $y(x)$ in the interval $0 \le x < 2h$ for equation (70) (why is it so?).

17. Multiple Power Series. The role of multiple power series in 
the theory of functions of several arguments is similar to that of 
ordinary power series in the theory of functions of one independent 
variable. For the sake of simplicity we shall restrict ourselves to 
double power series. The power series of higher multiplicity are 
treated similarly. 

To put down the expression of a double series it is convenient to use the double index notation which was applied to writing a double number series in Sec. 6:

$$S(x, y) = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} a_{mn} x^m y^n = a_{00} + a_{10} x + a_{01} y + a_{20} x^2 + a_{11} xy + a_{02} y^2 + \ldots \tag{71}$$





If the series is absolutely convergent the order of performing its 
summation does not matter. 

The domain of convergence of series (71) is a region of the $x, y$-plane (which may coincide with the whole plane in a particular case). Such a domain can be of the form represented in Fig. 339. For any fixed $y$ we obtain a power series in powers of $x$ whose radius of convergence $R$ may depend on $y$, i.e. $R = R(y)$. Therefore the domain of convergence is symmetric with respect to the $y$-axis. The symmetry with respect to the $x$-axis is implied by the same argument. In the case of absolute convergence of series (71), $R(y)$ is a non-increasing function of $y$ for $y \ge 0$ (why?).

Series of the form

$$\sum_{m=0}^{\infty} \sum_{n=0}^{\infty} a_{mn} (x - a)^m (y - b)^n \tag{72}$$

are treated similarly. The domain of convergence of such a series is a plane figure with centre of symmetry at the point $(a, b)$.

Fig. 339

The properties of multiple power series are analogous to those of ordinary power series (see Secs. 11 and 12). In particular, multiple power series are obtained in expanding a function of several variables into Taylor's series (see Sec. XII.6):



$$f(x, y) = f(0, 0) + \frac{f'_x(0, 0)}{1!}\, x + \frac{f'_y(0, 0)}{1!}\, y + \frac{f''_{xx}(0, 0)}{2!}\, x^2 + \frac{f''_{xy}(0, 0)}{1!\, 1!}\, xy + \frac{f''_{yy}(0, 0)}{2!}\, y^2 + \ldots$$


Series (72) are obtained in the same way. Multiple power series are 
applicable when we use the small parameter method (see Secs. V.5 
and XV. 27) for an equation containing several parameters. They 
can also be utilized in many other problems. 

18. Functions of Matrices. Let $A$ be a square matrix (see Sec. XI). For definiteness, let it be of the third order (the results that we shall obtain here are true for matrices of any order). Earlier (see Sec. XI.2) we introduced such simplest functions of matrices as $A^2$ and $A^{-1}$. In what sense should we understand an expression of the form $e^A$ and the like? The importance of the exponential function in mathematics indicates the advisability of putting this question.

It turns out that a reasonable answer to the question can be obtained by means of power series if we apply them by analogy with Sec. VIII.4, where the notion of a function of a complex variable was defined. Suppose that we are given a function $f(x)$ which can be expanded into a power series

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots + a_n x^n + \ldots \tag{73}$$

By definition, we write

$$f(A) = a_0 I + a_1 A + a_2 A^2 + \ldots + a_n A^n + \ldots \tag{74}$$

where $I$ is the unit matrix of the same order as $A$. For example, we have

$$e^A = I + \frac{A}{1!} + \frac{A^2}{2!} + \frac{A^3}{3!} + \ldots \tag{75}$$


The definition makes sense if series (74) converges. We can point out a simple condition for the convergence. Let us assume that series (73) has the radius of convergence $R$ and suppose, for simplicity, that all the eigenvalues $\lambda_1, \lambda_2, \lambda_3$ of the matrix $A$ (see Sec. XI.4) are distinct. Then, as was shown in Sec. XI.8, the matrix $A$ can be transformed to the diagonal form, that is, there exists a non-degenerate matrix $H$ for which

$$H^{-1} A H = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_3) = \Lambda$$

But this implies

$$A = H \Lambda H^{-1}, \quad A^2 = (H \Lambda H^{-1})(H \Lambda H^{-1}) = H \Lambda^2 H^{-1},$$

$$A^3 = A^2 A = (H \Lambda^2 H^{-1})(H \Lambda H^{-1}) = H \Lambda^3 H^{-1}$$

etc. Consequently, series (74) can be rewritten as

$$H a_0 I H^{-1} + H a_1 \Lambda H^{-1} + H a_2 \Lambda^2 H^{-1} + \ldots = H (a_0 I + a_1 \Lambda + a_2 \Lambda^2 + \ldots) H^{-1} \tag{76}$$

A diagonal matrix can be easily raised to a power:

$$\Lambda^2 = \operatorname{diag}(\lambda_1^2, \lambda_2^2, \lambda_3^2), \quad \Lambda^3 = \operatorname{diag}(\lambda_1^3, \lambda_2^3, \lambda_3^3) \tag{77}$$

etc. (let the reader verify that in the general case of multiplication of diagonal matrices the product is a diagonal matrix whose elements are the products of the corresponding diagonal elements of the matrix factors). Therefore, we have

$$a_0 I + a_1 \Lambda + a_2 \Lambda^2 + \ldots = \operatorname{diag}(a_0 + a_1 \lambda_1 + a_2 \lambda_1^2 + \ldots,\ a_0 + a_1 \lambda_2 + a_2 \lambda_2^2 + \ldots,\ a_0 + a_1 \lambda_3 + a_2 \lambda_3^2 + \ldots) \tag{78}$$

If the series forming the diagonal converge, series (76) converges as well, which leads to the convergence of series (74).

Thus, we can assert that if all the eigenvalues of the matrix $A$ are less than $R$ in their moduli [where $R$ is the radius of convergence of series (73)], series (74) is convergent and even absolutely convergent. If at least one of the eigenvalues exceeds $R$ in its absolute value, series (74) is divergent. It can be proved that the above result is also valid for the case when the matrix $A$ has multiple eigenvalues.

Formulas (76) and (78) also imply the following formula applicable to a matrix $A$ which can be transformed to the diagonal form by means of a matrix $H$:

$$f(A) = H \operatorname{diag}(f(\lambda_1), f(\lambda_2), f(\lambda_3))\, H^{-1}$$

What has been proved implies that series (75) is convergent for any matrix $A$ since the corresponding series (IV.55) has the radius of convergence $R = \infty$. Another important example is the series

$$I + A + A^2 + \ldots + A^n + \ldots \tag{79}$$

which converges if all the eigenvalues of the matrix $A$ are less than unity in their moduli (why?).
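The two ways of computing a function of a matrix described above can be compared numerically. The Python sketch below is an added illustration under stated assumptions: the sample matrix with rows $(2, 1)$ and $(1, 2)$, its eigenvalues 1 and 3 with eigenvectors $(1, -1)$ and $(1, 1)$, and the helper names `mat_mul`, `exp_by_series` and `exp_by_diagonalization` are all chosen for the example and do not come from the text. It sums series (75) directly and evaluates $H \operatorname{diag}(e^{\lambda_1}, e^{\lambda_2}) H^{-1}$ for the same matrix.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def exp_by_series(a, terms=40):
    """Partial sum I + A/1! + A^2/2! + ... of series (75)."""
    n = len(a)
    total = identity(n)
    term = identity(n)  # holds A^k / k!
    for k in range(1, terms):
        term = [[value / k for value in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def exp_by_diagonalization():
    """H diag(e^1, e^3) H^-1 for the sample matrix [[2, 1], [1, 2]].

    The columns of H are the eigenvectors (1, -1) and (1, 1).
    """
    h = [[1.0, 1.0], [-1.0, 1.0]]
    h_inverse = [[0.5, -0.5], [0.5, 0.5]]
    diagonal = [[math.exp(1.0), 0.0], [0.0, math.exp(3.0)]]
    return mat_mul(mat_mul(h, diagonal), h_inverse)


if __name__ == "__main__":
    sample = [[2.0, 1.0], [1.0, 2.0]]
    print(exp_by_series(sample))
    print(exp_by_diagonalization())
```

Both results should agree to rounding error, which is what the formula $f(A) = H \operatorname{diag}(f(\lambda_1), \ldots)\, H^{-1}$ asserts for a diagonalizable matrix.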

Many other properties of ordinary functions can be extended to functions of matrices. These properties can be proved by manipulating series in the manner in which formula (64) was deduced from formula (63). For example, the identity

$$(1 + x + x^2 + \ldots)(1 - x) = \frac{1}{1 - x}\,(1 - x) = 1$$

implies the relation

$$(I + A + A^2 + \ldots)(I - A) = I$$

i.e. the sum of series (79) equals $(I - A)^{-1}$ if it is convergent. At the same time we should take into account that when proving some properties by means of series we use a permutation of factors (for instance, the relation $ab + ba = 2ab$) which may not be applicable to matrices. For instance, we apply the above relation to deducing the formula

$$e^A e^B = e^{A + B}$$

and thus it is valid for commuting matrices $A$ and $B$ and is inapplicable when they are non-commuting.

As an example of applying the notions introduced in this section, let us establish certain conditions guaranteeing the convergence of the iterative method of solving a system of linear algebraic equations. System (VI.19) can be rewritten in the vector form

$$\mathbf{x} = A\mathbf{x} + \mathbf{b} \tag{80}$$

where $\mathbf{b}$ is a given vector, $A$ is a given coefficient matrix and $\mathbf{x}$ is the sought-for vector. Taking an initial approximation $\mathbf{x} = \mathbf{x}_0$ we obtain the successive approximations according to the iterative method:

$$\mathbf{x}_1 = \mathbf{b} + A\mathbf{x}_0,$$

$$\mathbf{x}_2 = \mathbf{b} + A\mathbf{x}_1 = \mathbf{b} + A(\mathbf{b} + A\mathbf{x}_0) = \mathbf{b} + A\mathbf{b} + A^2\mathbf{x}_0,$$

$$\mathbf{x}_3 = \mathbf{b} + A\mathbf{x}_2 = \mathbf{b} + A\mathbf{b} + A^2\mathbf{b} + A^3\mathbf{x}_0$$

and so on. Generally, we have

$$\mathbf{x}_n = (I + A + A^2 + \ldots + A^{n-1})\,\mathbf{b} + A^n \mathbf{x}_0 \tag{81}$$

For the process to be convergent it is necessary that the initial approximation should not affect the result obtained in the limit, that is, we must have $A^n \to 0$ as $n \to \infty$. For this to be so it is sufficient, on the basis of formulas (77), that all the eigenvalues of the matrix $A$ be less than unity in their absolute values. It is the last condition that guarantees the convergence of the iterative method. If it is fulfilled we obtain, passing to the limit in formula (81) as $n \to \infty$, the formula

$$\mathbf{x} = \lim_{n \to \infty} \mathbf{x}_n = (I + A + A^2 + \ldots + A^n + \ldots)\,\mathbf{b} = (I - A)^{-1}\mathbf{b}$$

The direct substitution of the vector $\mathbf{x}$ thus obtained into equation (80) shows that the vector satisfies the equation. (Perform it!)
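The convergence condition can be observed on a small example. In the Python sketch below, added as an illustration, the matrix with rows $(0.5, 0.25)$ and $(0.25, 0.5)$, the vector $(1, 2)$ and the function name `iterate` are assumptions of the example. The eigenvalues of this matrix are 0.25 and 0.75, so the iteration should converge to the solution $(16/3, 20/3)$ of $(I - A)\mathbf{x} = \mathbf{b}$ whatever the initial approximation.

```python
def iterate(a, b, x0, steps):
    """Apply x_{k+1} = A x_k + b `steps` times, starting from x0."""
    n = len(b)
    x = list(x0)
    for _ in range(steps):
        x = [b[i] + sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


if __name__ == "__main__":
    # Sample data: the eigenvalues of A are 0.25 and 0.75, both below unity.
    a = [[0.5, 0.25], [0.25, 0.5]]
    b = [1.0, 2.0]
    # Two different initial approximations lead to the same limit.
    print(iterate(a, b, [0.0, 0.0], 200))
    print(iterate(a, b, [100.0, -50.0], 200))
```

The contribution $A^n \mathbf{x}_0$ of the initial approximation decays here like $0.75^n$, which is why both runs end at the same vector.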

By analogy with vector functions of a scalar argument (see Sec. VII.23), we can consider matrix functions of a scalar argument, having the form $B = B(x)$. Many of the properties of ordinary functions can be extended to this case. For instance, we often use the function

$$B = e^{Ax} \quad (-\infty < x < \infty,\ Ax = xA)$$

where $A$ is a constant matrix. Applying the series techniques we can readily prove that $(e^{Ax})' = Ae^{Ax}$. In particular, it follows that we have $(e^{Ax}\mathbf{c})' = Ae^{Ax}\mathbf{c}$ for any constant vector $\mathbf{c}$. But this means that the vector function of $x$ having the form

$$\mathbf{y} = e^{Ax}\mathbf{c} \tag{82}$$

is a solution of matrix equation (XV.149) with constant coefficients:

$$\mathbf{y}' = A\mathbf{y} \tag{83}$$

If an initial condition of the form $\mathbf{y}|_{x = x_0} = \mathbf{y}_0$ is given, formula (82) implies

$$\mathbf{y}_0 = e^{Ax_0}\mathbf{c}, \quad \text{i.e.} \quad \mathbf{c} = e^{-Ax_0}\mathbf{y}_0$$

Hence we obtain the following explicit formula for the solution:

$$\mathbf{y} = e^{Ax} e^{-Ax_0}\mathbf{y}_0 = e^{A(x - x_0)}\mathbf{y}_0$$






An arbitrary initial condition being satisfied, formula (82) represents 
the general solution of equation (83). 

19. Asymptotic Expansions. Asymptotic expansions, introduced as early as the 19th century by H. Poincaré (1854-1912), a prominent French mathematician, are widely applied in modern mathematics.

We shall consider the expansions in powers of $\frac{1}{x}$, which are more often encountered than the expansions in $x$. But of course this distinction is not essential because the substitution $\frac{1}{x} = x_1$ transforms an expansion in $\frac{1}{x}$ (as $x$ approaches infinity) to an expansion in $x_1$ (as $x_1 \to 0$) and vice versa. For example, from series (IV.55) we directly obtain

$$e^{1/x} = 1 + \frac{1}{1!\,x} + \frac{1}{2!\,x^2} + \ldots + \frac{1}{n!\,x^n} + \ldots \tag{84}$$


Let us begin with an example. We shall investigate the behaviour of the function

$$f(x) = \int_x^{\infty} e^{x^2 - s^2}\, ds$$

[which is equal to $e^{x^2} \left(\frac{\sqrt{\pi}}{2} - \operatorname{Erf} x\right)$; see formulas (XIV.36') and (XIV.72)] for $x \to \infty$. Applying L'Hospital's rule we can readily show that $f(x) \sim \frac{1}{2x}$ and hence (see Secs. III.8 and III.11) we have

$$f(x) = \frac{1}{2x} + o\left(\frac{1}{x}\right) \tag{85}$$

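To specify the expansion we integrate by parts:

$$f(x) = -\int_x^{\infty} \frac{1}{2s}\, d\left(e^{x^2 - s^2}\right) = \frac{1}{2x} - \frac{1}{2} \int_x^{\infty} \frac{e^{x^2 - s^2}}{s^2}\, ds$$

We similarly verify that the last integral is equivalent to $\frac{1}{2x^3}$, i.e. we obtain

$$f(x) = \frac{1}{2x} - \frac{1}{2^2 x^3} + o\left(\frac{1}{x^3}\right) \tag{86}$$

This is a more accurate expansion than (85) since the term $o\left(\frac{1}{x^3}\right)$ entering into (86) tends to zero faster than the analogous term in (85) as $x \to \infty$. Further integrations by parts result in still more accurate expansions

$$f(x) = \frac{1}{2x} - \frac{1}{2^2 x^3} + \frac{1 \cdot 3}{2^3 x^5} + o\left(\frac{1}{x^5}\right) \tag{87}$$

$$f(x) = \frac{1}{2x} - \frac{1}{2^2 x^3} + \frac{1 \cdot 3}{2^3 x^5} - \frac{1 \cdot 3 \cdot 5}{2^4 x^7} + o\left(\frac{1}{x^7}\right) \tag{88}$$

etc. (Let the reader verify the calculations!)

One can think that these operations should result in an expansion of the function $f(x)$ into the series

$$\frac{1}{2x} - \frac{1}{2^2 x^3} + \frac{1 \cdot 3}{2^3 x^5} - \frac{1 \cdot 3 \cdot 5}{2^4 x^7} + \frac{1 \cdot 3 \cdot 5 \cdot 7}{2^5 x^9} - \ldots \tag{89}$$

but D'Alembert's test obviously indicates that this series has a zero radius of convergence, i.e. it diverges for all $x$! Therefore (89) cannot be used as an ordinary infinite series, but formulas (85)-(88) show that we can use its partial sums.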

Now we proceed to give the general definition of an asymptotic expansion. A function $f(x)$ is said to have the asymptotic expansion

$$f(x) \sim a_0 + \frac{a_1}{x} + \frac{a_2}{x^2} + \ldots + \frac{a_n}{x^n} + \ldots \tag{90}$$

for $x \to \infty$ if for any $n = 0, 1, 2, \ldots$ we have the representation

$$f(x) = a_0 + \frac{a_1}{x} + \ldots + \frac{a_n}{x^n} + o\left(\frac{1}{x^n}\right) \quad (\text{for } x \to \infty)$$

This property automatically holds in the case of a convergent power series of form (84). But it can also hold when series (90) is divergent everywhere or convergent to a function distinct from $f(x)$.

When applying series (90) we restrict ourselves to a certain number of terms and drop all the subsequent summands. Then, estimating the last of the remaining terms, we draw a conclusion as to the values of $x$ for which the partial sum thus chosen can be used. Asymptotic expansions with alternating signs of their terms are particularly convenient because in this case the expanded function lies between any partial sum with an even number of terms and any partial sum with an odd number of terms.
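The behaviour of the partial sums of divergent series (89) can be seen numerically. The Python sketch below is an added illustration; the names `f_exact` and `partial_sum` are assumptions of the example. It also assumes the normalized complementary error function `math.erfc` of the Python standard library, for which $\int_x^{\infty} e^{-s^2}\, ds = \frac{\sqrt{\pi}}{2} \operatorname{erfc} x$, a normalization that may differ from the function $\operatorname{Erf}$ used in the text.

```python
import math


def f_exact(x):
    """f(x) = integral from x to infinity of exp(x^2 - s^2) ds."""
    return math.exp(x * x) * 0.5 * math.sqrt(math.pi) * math.erfc(x)


def partial_sum(x, terms):
    """Sum of the first `terms` terms of series (89)."""
    total = 0.0
    term = 1.0 / (2.0 * x)  # first term 1/(2x)
    for k in range(terms):
        total += term
        # The next term is obtained by multiplying by -(2k + 1)/(2 x^2).
        term *= -(2 * k + 1) / (2.0 * x * x)
    return total


if __name__ == "__main__":
    x = 3.0
    print("exact value:", f_exact(x))
    for terms in (1, 2, 3, 4, 5, 10, 20, 40, 60):
        s = partial_sum(x, terms)
        print(terms, s, abs(s - f_exact(x)))
```

For a fixed $x$ the error first decreases and then grows without bound as more terms are taken, and the exact value lies between consecutive partial sums while the terms are decreasing; this is the sense in which only partial sums of (89) are useful.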


§ 4. Trigonometric Series 

20. Orthogonality. Two real functions $g(x)$ and $h(x)$ defined over a finite or infinite interval $a < x < b$ are said to be orthogonal to each other on the interval if

$$\int_a^b g(x)\, h(x)\, dx = 0 \tag{91}$$

The functions are supposed to be finite or infinite but they must have absolutely convergent integral (91). The application of the term “orthogonality” is accounted for by the fact that formula (91) turns out to be in many respects analogous to the condition of the orthogonality of two vectors given in the form of their resolutions with respect to a Cartesian basis (see Secs. VII.10 and VII.20-21).

A system of functions is referred to as being orthogonal on an 
interval if any two functions belonging to the system are orthogonal 
on the interval. One of the most important orthogonal systems is 
the system of trigonometric functions 

$$1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots, \cos nx, \sin nx, \ldots \tag{92}$$

which is considered on the interval $-\pi \le x \le \pi$. To prove the orthogonality we compute the integral

$$\int_{-\pi}^{\pi} \cos nx \cos mx\, dx = \frac{1}{2} \int_{-\pi}^{\pi} [\cos (m - n)x + \cos (m + n)x]\, dx = \frac{1}{2} \left[\frac{\sin (m - n)x}{m - n} + \frac{\sin (m + n)x}{m + n}\right]_{-\pi}^{\pi} = 0 \tag{93}$$

(for $m \ne n$) and, similarly, evaluate the integrals

$$\int_{-\pi}^{\pi} \sin nx \sin mx\, dx = 0 \quad (\text{for } m \ne n)$$

and

$$\int_{-\pi}^{\pi} \cos nx \sin mx\, dx = 0 \quad (\text{for any } m, n = 0, 1, 2, \ldots)$$

System (92) is also orthogonal on the interval $0 \le x \le 2\pi$ and, generally, on each interval of length $2\pi$. This follows from property 10 in Sec. XIV.4 if we take the product of two functions of form (92) as $f(x)$ and put $A = 2\pi$.

If we apply the property of an integral of an even function (see property 9 in Sec. XIV.4) we derive from (93) the relations

$$\int_{-\pi}^{\pi} \cos nx \cos mx\, dx = 2 \int_0^{\pi} \cos nx \cos mx\, dx = 0 \quad (m, n = 0, 1, 2, \ldots;\ m \ne n)$$

which means that the functions

$$1, \cos x, \cos 2x, \ldots, \cos nx, \ldots \tag{94}$$

form an orthogonal system on the interval $0 \le x \le \pi$. We similarly verify that the functions

$$\sin x, \sin 2x, \ldots, \sin nx, \ldots \tag{95}$$

also constitute an orthogonal system on the same interval. (Let the reader verify that system of functions (92) is not orthogonal on the interval $0 \le x \le \pi$.)

Changing the scale along the $x$-axis by introducing a scale factor we transform functions (92) into the functions

$$1, \cos \frac{\pi x}{l}, \sin \frac{\pi x}{l}, \cos \frac{2\pi x}{l}, \sin \frac{2\pi x}{l}, \ldots, \cos \frac{n\pi x}{l}, \sin \frac{n\pi x}{l}, \ldots \tag{96}$$

which form an orthogonal system on the interval $-l \le x \le l$.
Applying this technique we can uniformly stretch the intervals on 
which systems of functions (94) and (95) (and, generally, any ortho- 
gonal system) were originally defined. We can also substitute x -f h 
for x where h is an arbitrary constant, i.e. shift the graphs of the 
functions forming an orthogonal system along the x-axis, which 
does not affect the orthogonality of the system (on the shifted inter- 
val). 

There are many orthogonal systems of functions other than the trigonometric functions. For instance, let us construct a system of orthogonal polynomials on the interval $-1 \le x \le 1$. Take the system

$$1, x, x^2, x^3, \ldots, x^n, \ldots \quad (-1 \le x \le 1) \tag{97}$$

The first two functions of the system are orthogonal to each other:

$$\int_{-1}^{1} 1 \cdot x\, dx = 0$$

Thus we can put $P_0(x) = 1$ and $P_1(x) = x$. But the third function in (97) is not orthogonal to the first one (check it up!). To obtain a third function orthogonal to the former two functions let us take a linear combination of the first three functions of system (97): $P_2(x) = ax^2 + bx + c$. The coefficients $a, b, c$ must be chosen in such a way that $P_2(x)$ be orthogonal to the polynomials $P_0(x)$ and $P_1(x)$ constructed above:

$$\int_{-1}^{1} (ax^2 + bx + c) \cdot 1\, dx = 0 \quad \text{and} \quad \int_{-1}^{1} (ax^2 + bx + c) \cdot x\, dx = 0$$

From this we find (verify the result!):

$$b = 0, \quad a = -3c, \quad \text{i.e.} \quad P_2(x) = c\,(-3x^2 + 1)$$

The constant $c$ is an arbitrary quantity here. It is usually so chosen that $P_2(1) = 1$. (Such a choice of one of the equivalent objects is called the normalization.) Thus we obtain $c = -\frac{1}{2}$, and finally we have

$$P_2(x) = \frac{1}{2}(3x^2 - 1)$$

To construct $P_3(x)$ we take a linear combination of the first four functions of system (97), i.e. $P_3(x) = ax^3 + bx^2 + cx + d$, and choose the coefficients $a, b, c, d$ in such a way that $P_3(x)$ be orthogonal to the functions $P_0(x)$, $P_1(x)$ and $P_2(x)$ already found. Applying this condition and introducing the additional requirement $P_3(1) = 1$ we obtain, by analogy with the preceding calculations,

$$P_3(x) = \frac{1}{2}(5x^3 - 3x)$$

(let the reader verify the result!). Similarly, we find

$$P_4(x) = \frac{1}{8}(35x^4 - 30x^2 + 3), \quad P_5(x) = \frac{1}{8}(63x^5 - 70x^3 + 15x)$$

etc.

These polynomials are mutually orthogonal on the interval $-1 \le x \le 1$. They were investigated by Legendre in 1783-1785 and are now called the Legendre polynomials. The polynomials play an important role in various divisions of mathematics and physics.

This orthogonalization process, which we have applied to system of functions (97) on the interval $-1 \le x \le 1$, can also be used for any system of linearly independent functions on any interval if the integrals of the squares of the functions over the interval are convergent.
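The orthogonalization process lends itself to a direct computation. The Python sketch below is an added illustration with hypothetical helper names (`poly_mul`, `inner`, `legendre`); polynomials are stored as lists of coefficients in increasing powers of $x$, and exact fractions are used so that the results can be compared with the formulas for $P_2, \ldots, P_5$ digit for digit. Each new power $x^n$ is made orthogonal to the polynomials already built and is then normalized by the requirement $P_n(1) = 1$.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given by coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def integrate_on_unit_interval(p):
    """Exact integral of the polynomial p over -1 <= x <= 1."""
    # The integral of x^k over [-1, 1] is 2/(k + 1) for even k and 0 for odd k.
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(p) if k % 2 == 0)


def inner(p, q):
    """Integral of p(x) q(x) over -1 <= x <= 1, as in condition (91)."""
    return integrate_on_unit_interval(poly_mul(p, q))


def legendre(max_degree):
    """Orthogonalize 1, x, x^2, ... on [-1, 1], normalizing so that P_n(1) = 1."""
    result = []
    for n in range(max_degree + 1):
        p = [Fraction(0)] * n + [Fraction(1)]  # the monomial x^n
        for q in result:
            coeff = inner(p, q) / inner(q, q)
            padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - coeff * b for a, b in zip(p, padded)]
        value_at_one = sum(p)  # p(1) equals the sum of the coefficients
        result.append([c / value_at_one for c in p])
    return result


if __name__ == "__main__":
    for n, p in enumerate(legendre(5)):
        print(n, [str(c) for c in p])
```

The same loop works for any interval and any linearly independent system once `inner` is replaced by the corresponding integral.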

21. Series in Orthogonal Functions. Let us be given a system of functions

$$g_1(x), g_2(x), \ldots, g_n(x), \ldots \tag{98}$$

orthogonal on an interval $a < x < b$. We sometimes encounter the problem of expanding an arbitrary function $f(x)$ defined over the same interval into a series in functions (98), i.e. into a series of the form

$$f(x) = a_1 g_1(x) + a_2 g_2(x) + \ldots + a_n g_n(x) + \ldots = \sum_{n=1}^{\infty} a_n g_n(x) \tag{99}$$

where $a_n$ $(n = 1, 2, 3, \ldots)$ are some numerical coefficients. This leads to the questions whether it is possible to expand any function $f(x)$, how the coefficients $a_n$ can be found and in what way series (99) converges.

For simplicity's sake, let us consider all the functions and the interval $a < x < b$ to be finite. The answer to the first question is dependent on the choice of system (98). If expansion (99) exists for any function $f(x)$, system of functions (98) is called complete. It can be proved that all the orthogonal systems mentioned in Sec. 20 are complete on the corresponding intervals.

We now proceed to determine the coefficients $a_n$ of expansion (99) under the assumption that none of the functions (98) equals zero identically. For this purpose we multiply both sides of (99) by $g_n(x)$ and integrate the result over the interval $a \le x \le b$:

$$\int_a^b f(x)\, g_n(x)\, dx = a_1 \int_a^b g_1(x)\, g_n(x)\, dx + a_2 \int_a^b g_2(x)\, g_n(x)\, dx + \ldots + a_n \int_a^b g_n^2(x)\, dx + \ldots$$

By the orthogonality of system (98), all the integrals on the right-hand side of the last relation are equal to zero except the integral of $g_n^2(x)$, and hence we deduce the formula for the coefficients:

$$a_n = \frac{\int_a^b f(x)\, g_n(x)\, dx}{\int_a^b g_n^2(x)\, dx} \quad (n = 1, 2, 3, \ldots) \tag{100}$$

The coefficients being uniquely defined, we conclude, in particular, that if the sums of two series of form (99) are identically equal, the coefficients in the same functions $g_n(x)$ in the series are also equal, and that if the sum of series (99) is identically equal to zero, all the coefficients are also equal to zero.

22. Fourier Series. The above general results can be applied to concrete orthogonal systems of functions. For instance, taking system (92) we conclude that any finite function defined in the interval $-\pi \le x \le \pi$ can be expanded in a series of the form

$$f(x) = a_1 + a_2 \cos x + a_3 \sin x + a_4 \cos 2x + a_5 \sin 2x + \ldots$$

It is convenient to change the notation of the coefficients and to put down the series as

$$f(x) = a_0 + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + b_2 \sin 2x + \ldots = a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \tag{101}$$

The coefficients of the series are found by formula (100):

$$\begin{gathered} a_0 = \frac{\int_{-\pi}^{\pi} f(x) \cdot 1\, dx}{\int_{-\pi}^{\pi} 1^2\, dx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, dx, \\ a_n = \frac{\int_{-\pi}^{\pi} f(x) \cos nx\, dx}{\int_{-\pi}^{\pi} \cos^2 nx\, dx} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos nx\, dx, \\ b_n = \frac{\int_{-\pi}^{\pi} f(x) \sin nx\, dx}{\int_{-\pi}^{\pi} \sin^2 nx\, dx} = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin nx\, dx \quad (n \ge 1) \end{gathered} \tag{102}$$

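The series with respect to systems of functions (94) or (95) are investigated in a similar way:

$$\begin{gathered} f(x) = a_0 + \sum_{n=1}^{\infty} a_n \cos nx \quad (0 \le x \le \pi), \\ a_0 = \frac{1}{\pi} \int_0^{\pi} f(x)\, dx, \quad a_n = \frac{2}{\pi} \int_0^{\pi} f(x) \cos nx\, dx \quad (n \ge 1) \end{gathered} \tag{103}$$

$$\begin{gathered} f(x) = \sum_{n=1}^{\infty} b_n \sin nx \quad (0 \le x \le \pi), \\ b_n = \frac{2}{\pi} \int_0^{\pi} f(x) \sin nx\, dx \end{gathered} \tag{104}$$

We also often use series in functions (96) and in functions obtained from (94) or (95) by changing the scale along the $x$-axis:

$$\begin{gathered} f(x) = a_0 + \sum_{n=1}^{\infty} \left(a_n \cos \frac{n\pi x}{l} + b_n \sin \frac{n\pi x}{l}\right) \quad (-l \le x \le l), \\ a_0 = \frac{1}{2l} \int_{-l}^{l} f(x)\, dx, \quad a_n = \frac{1}{l} \int_{-l}^{l} f(x) \cos \frac{n\pi x}{l}\, dx, \quad b_n = \frac{1}{l} \int_{-l}^{l} f(x) \sin \frac{n\pi x}{l}\, dx \quad (n \ge 1), \end{gathered} \tag{105}$$

$$\begin{gathered} f(x) = a_0 + \sum_{n=1}^{\infty} a_n \cos \frac{n\pi x}{l} \quad (0 \le x \le l), \\ a_0 = \frac{1}{l} \int_0^l f(x)\, dx, \quad a_n = \frac{2}{l} \int_0^l f(x) \cos \frac{n\pi x}{l}\, dx \quad (n \ge 1) \end{gathered} \tag{106}$$

and

$$\begin{gathered} f(x) = \sum_{n=1}^{\infty} b_n \sin \frac{n\pi x}{l} \quad (0 \le x \le l), \\ b_n = \frac{2}{l} \int_0^l f(x) \sin \frac{n\pi x}{l}\, dx \quad (n \ge 1) \end{gathered} \tag{107}$$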

Series (101), (103) and (104) are special cases of series (105), (106) and (107) because the former can be obtained from the latter by putting $l = \pi$. They are all called Fourier series after the prominent French mathematician J. Fourier (1768-1830), who first applied them in his investigations in the theory of heat conductivity, although such series had been used before. Formulas (102) for the Fourier coefficients and some other similar formulas were obtained as early as 1759 by Clairaut and in 1777 by Euler.

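Consider some examples of expansions in Fourier series. Let it be necessary to expand the function $y = x$ in the interval $0 \le x \le l$ in series (106). For this purpose we compute the coefficients:

$$a_0 = \frac{1}{l} \int_0^l x\, dx = \frac{l}{2}$$

and

$$a_n = \frac{2}{l} \int_0^l x \cos \frac{n\pi x}{l}\, dx = \frac{2}{l} \left[\frac{l}{n\pi}\, x \sin \frac{n\pi x}{l}\, \bigg|_0^l - \frac{l}{n\pi} \int_0^l \sin \frac{n\pi x}{l}\, dx\right] = \frac{2}{l} \cdot \frac{l^2}{n^2 \pi^2} \cos \frac{n\pi x}{l}\, \bigg|_0^l = \frac{2l}{\pi^2 n^2} (\cos \pi n - 1) = -\frac{2l}{\pi^2 n^2} [1 - (-1)^n]$$

This yields

$$a_1 = -\frac{4l}{\pi^2 \cdot 1^2}, \quad a_2 = 0, \quad a_3 = -\frac{4l}{\pi^2 \cdot 3^2}, \quad a_4 = 0, \ldots$$

and finally we obtain

$$x = \frac{l}{2} - \frac{4l}{\pi^2} \left(\cos \frac{\pi x}{l} + \frac{1}{3^2} \cos \frac{3\pi x}{l} + \frac{1}{5^2} \cos \frac{5\pi x}{l} + \ldots\right) \quad (0 \le x \le l) \tag{108}$$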
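Expansion (108) is easy to test numerically. The Python sketch below is an added illustration; the function name `cosine_series_of_x` and the sample value $l = 2$ are assumptions of the example. It sums the odd harmonics of the series up to a chosen number and compares the result with $x$ itself at several points of the interval.

```python
import math


def cosine_series_of_x(x, l, harmonics):
    """Partial sum of series (108) for y = x on 0 <= x <= l.

    Only odd n up to `harmonics` are summed, since a_n = 0 for even n.
    """
    total = l / 2.0
    for n in range(1, harmonics + 1, 2):
        total -= 4.0 * l / (math.pi ** 2 * n ** 2) * math.cos(n * math.pi * x / l)
    return total


if __name__ == "__main__":
    l = 2.0
    for x in (0.0, 0.5, 1.0, 1.7, 2.0):
        print(x, cosine_series_of_x(x, l, 201))
```

Because the coefficients decrease like $1/n^2$, the neglected tail after $n = 201$ is of the order of a few thousandths for this value of $l$.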

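As another example let us take a function which is defined by means of several formulas. Namely, let

$$f(x) = \begin{cases} 1 & \text{for } -l < x < -l + a, \\ 0 & \text{for } -l + a < x < 0, \\ 1 & \text{for } 0 < x < a, \\ 0 & \text{for } a < x < l \end{cases}$$

where $a$ is an arbitrary constant number belonging to the interval $0 < a < l$. The graph of the function is represented in Fig. 340b by the part corresponding to the interval $-l < x < l$. Let us expand the function in series (105):

$$a_0 = \frac{1}{2l} \int_{-l}^{l} f(x)\, dx = \frac{1}{2l} \left(\int_{-l}^{-l+a} f(x)\, dx + \int_{-l+a}^{0} f(x)\, dx + \int_0^a f(x)\, dx + \int_a^l f(x)\, dx\right) = \frac{1}{2l} \left(\int_{-l}^{-l+a} 1\, dx + \int_{-l+a}^{0} 0\, dx + \int_0^a 1\, dx + \int_a^l 0\, dx\right) = \frac{1}{2l}(a + 0 + a + 0) = \frac{a}{l},$$

$$a_n = \frac{1}{l} \int_{-l}^{l} f(x) \cos \frac{n\pi x}{l}\, dx = \frac{1}{l} \left(\int_{-l}^{-l+a} \cos \frac{n\pi x}{l}\, dx + \int_0^a \cos \frac{n\pi x}{l}\, dx\right) = \frac{1}{n\pi} \left[\sin \frac{n\pi(-l + a)}{l} - \sin (-n\pi) + \sin \frac{n\pi a}{l}\right] = \frac{1}{n\pi} \left(-\sin n\pi \cos \frac{n\pi a}{l} + \cos n\pi \sin \frac{n\pi a}{l} + \sin \frac{n\pi a}{l}\right) = \frac{1}{n\pi} [(-1)^n + 1] \sin \frac{n\pi a}{l} \quad (n \ge 1),$$

$$b_n = \frac{1}{l} \left(\int_{-l}^{-l+a} \sin \frac{n\pi x}{l}\, dx + \int_0^a \sin \frac{n\pi x}{l}\, dx\right) = -\frac{1}{n\pi} \left[\cos \frac{n\pi(-l + a)}{l} - \cos (-n\pi) + \cos \frac{n\pi a}{l} - 1\right] = -\frac{1}{n\pi} \left[\cos n\pi \cos \frac{n\pi a}{l} + \sin n\pi \sin \frac{n\pi a}{l} - (-1)^n + \cos \frac{n\pi a}{l} - 1\right] = -\frac{1}{n\pi} \left\{[(-1)^n + 1] \cos \frac{n\pi a}{l} - (-1)^n - 1\right\} = \frac{1}{n\pi} [(-1)^n + 1] \left(1 - \cos \frac{n\pi a}{l}\right) \quad (n \ge 1)$$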

(Here we have applied the technique which is used for computing an integral of any function represented by several formulas.) Thus, we have $a_n = b_n = 0$ for all odd $n$ and

$$a_{2k} = \frac{1}{k\pi} \sin \frac{2k\pi a}{l}, \quad b_{2k} = \frac{1}{k\pi} \left(1 - \cos \frac{2k\pi a}{l}\right) \quad (k = 1, 2, 3, \ldots)$$

for even numbers $n = 2k$ $(k = 1, 2, 3, \ldots)$.

The result becomes particularly simple when $a = \frac{l}{2}$ because in this case

$$a_{2k} = \frac{1}{k\pi} \sin k\pi = 0, \quad b_{2k} = \frac{1}{k\pi} (1 - \cos k\pi) = \frac{1}{k\pi} [1 - (-1)^k] \quad (k = 1, 2, \ldots)$$

and thus the series takes the form

$$f(x) = \frac{1}{2} + \frac{2}{\pi} \left(\sin \frac{2\pi x}{l} + \frac{1}{3} \sin \frac{6\pi x}{l} + \frac{1}{5} \sin \frac{10\pi x}{l} + \ldots\right) \tag{109}$$


Consequently, a function represented by several formulas can be 
expanded in a unique series. The discovery of this fact by Fourier 
was a remarkable event which led to a considerable extension of 
the notion of a function. 

In applying the theory to practical expansions in Fourier series we usually apply formulas of numerical integration (see Sec. XIV.13), which are particularly important when the function in question is represented by a table or a graph. For instance, suppose that we consider an expansion in series (107) and want to utilize the trapezoid rule by dividing the interval of integration into 24 parts. Then, introducing the notation $x_k = \frac{kl}{24}$, $f_k = f(x_k)$ $(k = 0, 1, \ldots, 24)$, we obtain

$$b_n = \frac{2}{l} \int_0^l f(x) \sin \frac{n\pi x}{l}\, dx \approx \frac{2}{l} \cdot \frac{l}{24} \left(\frac{f_0}{2} \sin \frac{n\pi x_0}{l} + f_1 \sin \frac{n\pi x_1}{l} + \ldots + \frac{f_{24}}{2} \sin \frac{n\pi x_{24}}{l}\right) = \frac{1}{12} \left(\frac{f_0}{2} \sin 0^\circ n + f_1 \sin 7.5^\circ n + f_2 \sin 15^\circ n + \ldots + \frac{f_{24}}{2} \sin 180^\circ n\right) \tag{110}$$

We see that for any $n$ it is only the following values of the sine that are needed:

sin 0° = 0.0000,      sin 52.5° = 0.7934,
sin 7.5° = 0.1305,    sin 60° = 0.8660,
sin 15° = 0.2588,     sin 67.5° = 0.9239,
sin 22.5° = 0.3827,   sin 75° = 0.9659,
sin 30° = 0.5000,     sin 82.5° = 0.9914,
sin 37.5° = 0.6088,   sin 90° = 1.0000
sin 45° = 0.7071,

When applying formula (110) for a given $n$ we must substitute the corresponding values of the sine taken from this table by applying the reduction formulas of trigonometry, group together the terms having the same second factor, sum up the values of $f_k$ in these groups and then, after the multiplication has been performed, compute the whole sum.
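Formula (110) translates directly into a short program. The Python sketch below is an added illustration; the function name `sine_coefficient_trapezoid` and the test function $f(x) = x$ with $l = 1$ are assumptions of the example. For this function the exact coefficients of series (107) are $b_n = \frac{2l(-1)^{n+1}}{n\pi}$, which gives something to compare the trapezoid values with.

```python
import math


def sine_coefficient_trapezoid(samples, n):
    """Approximate b_n of series (107) by formula (110).

    `samples` must contain the 25 values f_0, ..., f_24 taken at x_k = k*l/24.
    """
    if len(samples) != 25:
        raise ValueError("expected 25 samples f_0 .. f_24")
    total = 0.0
    for k, f_k in enumerate(samples):
        weight = 0.5 if k in (0, 24) else 1.0  # end points enter with weight 1/2
        total += weight * f_k * math.sin(math.radians(7.5 * n * k))
    return total / 12.0


if __name__ == "__main__":
    samples = [k / 24 for k in range(25)]  # f(x) = x with l = 1
    for n in (1, 2, 3):
        exact = 2 * (-1) ** (n + 1) / (n * math.pi)
        print(n, sine_coefficient_trapezoid(samples, n), exact)
```

The error of the trapezoid rule grows with $n$, because the integrand oscillates faster while the step $\frac{l}{24}$ stays fixed, so the higher coefficients obtained in this way are less reliable.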

23. Expanding a Periodic Function. Fourier series are used not only for expanding a function defined on a finite interval but also for functions defined over the whole axis. We first suppose that a function $f(x)$ is defined in the interval $-\pi < x < \pi$. Let us expand it in series (101). The terms of the series are defined not only inside the interval but also outside it and their period is equal to $2\pi$ (see Sec. I.16) because

$$\cos n(x + 2\pi) = \cos (nx + 2\pi n) = \cos nx,$$

$$\sin n(x + 2\pi) = \sin (nx + 2\pi n) = \sin nx \quad (n = 1, 2, 3, \ldots)$$

Hence, the sum can also be continued on the whole $x$-axis and is a periodic function of period $2\pi$. But the sum is equal to $f(x)$ on the interval $-\pi < x < \pi$ and consequently series (101) extends the function $f(x)$ from the interval $-\pi \le x \le \pi$ onto the whole $x$-axis with period $2\pi$.

Similarly, all the terms of series (103) being even functions, its sum results from the extension of the function $f(x)$ as an even function from the interval $0 \le x \le \pi$ onto the whole $x$-axis with period $2\pi$; similarly, the sum of series (104) is the continuation of $f(x)$ as an odd function with the same period. An analogous result is obtained when we take series (105)-(107) but the period is naturally equal to $2l$.

Fig. 340 represents the graphs of the sums of the series considered in the examples of Sec. 22 which are regarded as being extended onto the whole $x$-axis. It should be noted that although $2l$ is a period for the second example, it is not the least period, which equals $l$ in this case.

Now let a function $f(x)$ be originally defined over the whole $x$-axis as a periodic function of period $2\pi$. If we take a series of form (101) in which the coefficients are computed by formulas (102) then, as we have shown, its sum will yield the periodic continuation of $f(x)$ from the interval $-\pi \le x \le \pi$ onto the whole $x$-axis with period $2\pi$ and thus it will coincide with $f(x)$ on the whole axis.

The coefficients can also be found by the formulas

$$a_0 = \frac{1}{2\pi} \int_{a-\pi}^{a+\pi} f(x)\, dx, \quad a_n = \frac{1}{\pi} \int_{a-\pi}^{a+\pi} f(x) \cos nx\, dx, \quad b_n = \frac{1}{\pi} \int_{a-\pi}^{a+\pi} f(x) \sin nx\, dx \quad (n = 1, 2, \ldots)$$

where $a$ is an arbitrary number. In particular, we can take the formula

$$a_0 = \frac{1}{2\pi} \int_0^{2\pi} f(x)\, dx$$

and similar formulas for the coefficients $a_n$ and $b_n$. This is implied by the periodicity of the integrands (see Sec. XIV.4, property 10).

Similarly, an even (odd) function with period $2\pi$ can be expanded in series (103) [series (104)]. For a function of period $2l$ we obtain series (105)-(107).

Expansion (105) is often transformed by means of formula (I.18). This results in

$$f(x) = a_0 + \sum_{n=1}^{\infty} M_n \sin \left(\frac{n\pi x}{l} + \alpha_n\right) \tag{111}$$

The constant $a_0$ is equal to the mean value of the function $f(x)$ on the whole $x$-axis (see Sec. XIV.5) since the means of the other summands are equal to zero (check it up!). The first variable summand in (111) is called the fundamental harmonic; it has the least period $2l$. The subsequent summands are called higher (upper) harmonics. Their least periods are equal, in succession, to $\frac{2l}{2}$, $\frac{2l}{3}$, $\frac{2l}{4}$ and so on. Therefore an expansion of a periodic function in a Fourier series is referred to as the harmonic analysis.





If the independent variable is interpreted as time it is advisable to denote the period by $T$ and to rewrite formula (111) in the form

$$f(t) = a_0 + \sum_{n=1}^{\infty} M_n \sin (n\omega t + \alpha_n) \quad \left(\omega = \frac{2\pi}{T}\right)$$

Thus, a Fourier series represents an arbitrary periodic oscillatory motion in the form of a sum of harmonic oscillations with multiple frequencies. Such expansions are well known in acoustics where the fundamental harmonic $M_1 \sin (\omega t + \alpha_1)$ determines the fundamental tone and the subsequent harmonics are the overtones which determine the tone colour.

The periods of the summands in expansion (112) are commensu- 
rable with one another, i.e. their ratios are rational numbers. This 
is related to the general property that the sum of periodic functions 
with different periods is periodic if and only if the periods of the 
summands are commensurable. A sum of periodic functions with 
incommensurable periods belongs to a wider class of the so-called 
almost periodic functions which have many applications. In parti- 
cular, they are used for investigating superpositions of non-syn- 
chronous vibrations. 

24. Example. Bessel’s Functions as Fourier Coefficients. A function 
of the form 

$$e^{ix\cos t} = \cos(x\cos t) + i\sin(x\cos t) \qquad (112)$$

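plays an important role in radio engineering. It is an even periodic
function with period 2π with respect to t for any fixed x. Therefore
it can be expanded in a Fourier series of form (103). To obtain the
expansion we use the relation cos t = (e^{it} + e^{-it})/2 and
multiply the series

$$e^{\frac{ix}{2}e^{it}} = 1 + \frac{1}{1!}\,\frac{ix}{2}\,e^{it} + \frac{1}{2!}\left(\frac{ix}{2}\right)^{2} e^{2it} + \frac{1}{3!}\left(\frac{ix}{2}\right)^{3} e^{3it} + \dots$$

by the series

$$e^{\frac{ix}{2}e^{-it}} = 1 + \frac{1}{1!}\,\frac{ix}{2}\,e^{-it} + \frac{1}{2!}\left(\frac{ix}{2}\right)^{2} e^{-2it} + \frac{1}{3!}\left(\frac{ix}{2}\right)^{3} e^{-3it} + \dots$$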
After the multiplication has been performed we obtain periodic
function (112) on the left-hand side. Let us combine the terms on
the right-hand side which contain the same exponential functions;
to do this we note that we have the coefficient

$$\frac{1}{k!}\left(\frac{ix}{2}\right)^{k} + \frac{1}{1!}\,\frac{ix}{2}\cdot\frac{1}{(k+1)!}\left(\frac{ix}{2}\right)^{k+1} + \frac{1}{2!}\left(\frac{ix}{2}\right)^{2}\cdot\frac{1}{(k+2)!}\left(\frac{ix}{2}\right)^{k+2} + \dots$$
$$= i^{k}\left[\frac{1}{k!}\left(\frac{x}{2}\right)^{k} - \frac{1}{1!\,(k+1)!}\left(\frac{x}{2}\right)^{k+2} + \frac{1}{2!\,(k+2)!}\left(\frac{x}{2}\right)^{k+4} - \dots\right] = i^{k} J_k(x)$$

in e^{ikt} and e^{-ikt} (see Sec. XV.26). Thus, we obtain


$$e^{ix\cos t} = J_0(x) + \sum_{k=1}^{\infty} i^{k} J_k(x)\left(e^{ikt} + e^{-ikt}\right) = J_0(x) + 2\sum_{k=1}^{\infty} i^{k} J_k(x)\cos kt \qquad (113)$$


It is Fourier expansion (113) that we need. Separating the real 
and imaginary parts in (113) we arrive at the expansions 

$$\cos(x\cos t) = J_0(x) - 2J_2(x)\cos 2t + 2J_4(x)\cos 4t - \dots$$

and

$$\sin(x\cos t) = 2J_1(x)\cos t - 2J_3(x)\cos 3t + 2J_5(x)\cos 5t - \dots$$

Formula (113) implies, in particular, the integral representation 
of a Bessel function of an integral order: 

$$J_n(x) = \frac{i^{-n}}{\pi}\int_0^{\pi} e^{ix\cos t}\cos nt\,dt$$

[check up the result taking advantage of formula (103) for the coef- 
ficients of series (103)]. 
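
Expansion (113) and the integral representation can be checked
numerically. The sketch below is an illustration only: the helper names
are hypothetical, J_n is evaluated from its power series (the bracket
in the coefficient computed above), and the integral is approximated by
the midpoint rule.

```python
import cmath
import math

def bessel_j(n, x, terms=40):
    """J_n(x) for an integer n >= 0, summed from its power series."""
    return sum((-1) ** s * (x / 2) ** (n + 2 * s)
               / (math.factorial(s) * math.factorial(n + s))
               for s in range(terms))

def bessel_j_integral(n, x, steps=2000):
    """i^(-n)/pi times the integral of exp(i x cos t) cos(nt) over [0, pi]."""
    h = math.pi / steps
    total = sum(cmath.exp(1j * x * math.cos((j + 0.5) * h))
                * math.cos(n * (j + 0.5) * h) for j in range(steps))
    return (1j) ** (-n) * total * h / math.pi

def expansion_113(x, t, K=25):
    """Partial sum J_0 + 2 * sum_{k=1..K} i^k J_k(x) cos(kt) of series (113)."""
    return bessel_j(0, x) + 2 * sum((1j) ** k * bessel_j(k, x) * math.cos(k * t)
                                    for k in range(1, K + 1))

x, t = 1.5, 0.4
print(abs(expansion_113(x, t) - cmath.exp(1j * x * math.cos(t))))
for n in range(4):
    print(n, bessel_j(n, x), bessel_j_integral(n, x))
```

For moderate x the two columns printed by the loop should agree to many
digits, and the imaginary part of the integral should be negligible.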

25. Speed of Convergence of a Fourier Series. Let a bounded periodic
function f(x) of period 2l be expanded in Fourier series (105).
We can easily show that all the Fourier coefficients are bounded
above in their absolute values by the same positive constant:

$$|a_n| = \frac{1}{l}\left|\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx\right| \le \frac{1}{l}\int_{-l}^{l} |f(x)|\left|\cos\frac{n\pi x}{l}\right| dx \le \frac{1}{l}\int_{-l}^{l} |f(x)|\,dx$$

The same result is obtained if the function f(x) is unbounded but
absolutely integrable (summable) on the interval −l ≤ x ≤ l,
i.e. if

$$\int_{-l}^{l} |f(x)|\,dx < \infty$$

Let the function f(x) be bounded and discontinuous. Then its
Fourier coefficients a_n and b_n are of the order of 1/n as n → ∞ (in
particular, this is the case in the second example considered in
Sec. 22).

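Actually, suppose, for definiteness, that f(x) has two discontinuities
in the interval −l < x < l at the points x = x_1 and x = x_2
and that −l < x_1 < x_2 < l. Then we have

$$a_n = \frac{1}{l}\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx = \frac{1}{l}\int_{-l}^{x_1} f(x)\cos\frac{n\pi x}{l}\,dx + \frac{1}{l}\int_{x_1}^{x_2} f(x)\cos\frac{n\pi x}{l}\,dx + \frac{1}{l}\int_{x_2}^{l} f(x)\cos\frac{n\pi x}{l}\,dx$$

Let us integrate by parts each of these integrals:

$$a_n = \frac{1}{l}\left[\left. f(x)\,\frac{l}{n\pi}\sin\frac{n\pi x}{l}\right|_{-l}^{x_1} - \frac{l}{n\pi}\int_{-l}^{x_1} f'(x)\sin\frac{n\pi x}{l}\,dx\right] + \dots$$
$$= -\frac{1}{n\pi}\left[f(x_1+0) - f(x_1-0)\right]\sin\frac{n\pi x_1}{l} - \frac{1}{n\pi}\left[f(x_2+0) - f(x_2-0)\right]\sin\frac{n\pi x_2}{l} - \frac{1}{n\pi}\int_{-l}^{l} f'(x)\sin\frac{n\pi x}{l}\,dx \qquad (114)$$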

The last summand on the right-hand side differs from the Fourier
coefficient of f'(x) only in the constant factor in front of the
integral. But

$$\int_{-l}^{l} |f'(x)|\,dx < \infty$$

because the derivative f'(x) retains its sign on each interval (α, β)
of monotonicity and continuity of the function f(x) and hence

$$\int_{\alpha}^{\beta} |f'(x)|\,dx = \left|\int_{\alpha}^{\beta} f'(x)\,dx\right| = |f(\beta - 0) - f(\alpha + 0)| < \infty$$

We have excluded from our considerations functions having an
infinite number of intervals of monotonicity within a finite interval
of variation of x (see Fig. 102) because they are rarely encountered.
We see that under the assumptions concerning f(x) the last integral
in formula (114) is bounded, which implies what has been said about
the order of smallness of the Fourier coefficients.

Now let the function f(x) itself be continuous. Suppose that its
derivative has discontinuities and is bounded (this is the case in
the first example considered in Sec. 22). Then its Fourier coefficients
are of the order of 1/n² as n → ∞.

Indeed, taking the expression of a Fourier coefficient and
integrating by parts we obtain

$$a_n = \frac{1}{l}\int_{-l}^{l} f(x)\cos\frac{n\pi x}{l}\,dx = -\frac{1}{n\pi}\int_{-l}^{l} f'(x)\sin\frac{n\pi x}{l}\,dx$$

Applying the argument of the preceding paragraphs to the integral
on the right-hand side we conclude that it is of the order of
1/n as n → ∞, and hence a_n is of the order of 1/n² (the same result is
similarly obtained for b_n). If f(x) and f'(x) are continuous and
f''(x) has discontinuities we can perform the integration by parts
twice and thus prove that in this case the Fourier coefficients are
of the order of 1/n³ and so on. Hence, the order of smallness of
Fourier coefficients depends on the "smoothness" of the function in
question, i.e. on the number of continuous derivatives it possesses.
The greater the number, the higher the order of smallness of
the Fourier coefficients, that is the higher the speed of convergence
of the Fourier series of the function.
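
The two rates of decrease can be observed numerically. The sketch below
is an illustration under stated assumptions: the interval is taken to
be (−π, π), the two sample functions (a unit step and |x|) are chosen
here for the demonstration, and the coefficients are approximated by
the midpoint rule with an even number of steps so that no cell
straddles x = 0.

```python
import math

def fourier_coeffs(f, n, steps=4000):
    """Approximate a_n and b_n of f on (-pi, pi) by the midpoint rule."""
    h = 2 * math.pi / steps
    a = b = 0.0
    for j in range(steps):
        x = -math.pi + (j + 0.5) * h
        a += f(x) * math.cos(n * x)
        b += f(x) * math.sin(n * x)
    return a * h / math.pi, b * h / math.pi

def step(x):
    """Discontinuous sample function: 0 for x < 0, 1 for x > 0."""
    return 1.0 if x > 0 else 0.0

def tent(x):
    """Continuous sample function with a discontinuous derivative."""
    return abs(x)

orders = (1, 3, 9, 27)
# n * |b_n| stays near a constant for the step: coefficients of order 1/n
scaled_step = [n * abs(fourier_coeffs(step, n)[1]) for n in orders]
# n^2 * |a_n| stays near a constant for |x|: coefficients of order 1/n^2
scaled_tent = [n * n * abs(fourier_coeffs(tent, n)[0]) for n in orders]
print(scaled_step)
print(scaled_tent)
```

For odd n the printed values should stay close to 2/π and 4/π
respectively, which is what the integration-by-parts argument predicts.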

The Fourier series of a discontinuous function f(x) converges
very slowly and therefore it is difficult to apply it to practical
calculations. To overcome the difficulty we sometimes try to construct
a function φ(x) having discontinuities at the same points
as f(x) and the same jumps (see Fig. 341). It is advisable to choose
a function φ(x) whose structure is as simple as possible. After
φ(x) has been chosen we form the difference f(x) − φ(x) which
no longer has discontinuities and therefore is expanded in a Fourier
series whose speed of convergence is higher than that of the Fourier
series of f(x). Consequently, we represent f(x) as the sum of a simple
function φ(x) and a Fourier series with a higher speed of convergence.
We can similarly eliminate the discontinuities of the first
derivative and so on (compare this with the methods used in Sec. 4).

Thus, if a function f(x) is continuous and has a bounded derivative
the order of smallness of its Fourier coefficients (as n → ∞)
is not less than that of 1/n². But according to Sec. 1

$$\sum_{n=1}^{\infty} \frac{1}{n^2} < \infty$$

and hence, by Weierstrass' test (see Sec. 8), the Fourier series
converges uniformly on the whole x-axis. A more extensive investigation
shows that the condition which we have imposed on f'(x) is
unnecessary because Weierstrass' test can be applied under some more
general assumptions. In the case of a continuous function the
substitution of any numerical value of x into the series yields exactly
the corresponding value f(x).

If a function f(x) is discontinuous its Fourier series cannot
converge uniformly because its terms are continuous functions
(see property 1 in Sec. 9). It can be shown that in this case the
substitution of a numerical value of x into the series results in f(x)
at all the points of continuity of f(x) and in

$$\frac{f(x-0) + f(x+0)}{2}$$

at the points of discontinuity. For instance, series (109) has the
sum equal to 1/2 for x = 0 since in this case f(−0) = 0 and f(+0) = 1.

If a function defined in a finite interval is expanded in a Fourier
series (see Sec. 22) the speed of convergence of the series is specified
by the discontinuities of the function and of its derivatives which
occur after the function has been periodically extended onto the
whole x-axis as it was described in Sec. 23. For instance, the Fourier
coefficients in the first example considered in Sec. 22 are of the
order of 1/n² since the extension (see Fig. 340a) results in a function
whose first derivative has discontinuities. If the same function
is expanded into a series in sines [series (107)] the coefficients are
of the order of 1/n because after the extension has been performed
the function itself has discontinuities (why?).

Fourier series enable us to find the sums of many interesting 

numerical series. For instance, if we substitute x — -j- into series 
(109) we obtain 


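$$1 = \frac{1}{2} + \frac{2}{\pi}\left(\frac{1}{1}\sin\frac{\pi}{2} + \frac{1}{3}\sin\frac{3\pi}{2} + \frac{1}{5}\sin\frac{5\pi}{2} + \dots\right) = \frac{1}{2} + \frac{2}{\pi}\left(\frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \dots\right)$$

which implies

$$\frac{1}{1} - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dots = \frac{\pi}{4}$$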


If we substitute x = 0 into series (108) we get

$$\frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \dots = \frac{\pi^2}{8}$$


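Further, the last result makes it possible to find the value ζ(2)
of the zeta function [see formula (19)]:

$$\zeta(2) = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \frac{1}{5^2} + \frac{1}{6^2} + \dots = \left(\frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \dots\right) + \frac{1}{2^2}\left(\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots\right) = \frac{\pi^2}{8} + \frac{\zeta(2)}{4}$$

which results in

$$\zeta(2) = \frac{\pi^2}{6} \approx 1.645$$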
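
The three numerical sums obtained above are easy to confirm by direct
summation. The sketch below is only a check: the number of terms N is
an arbitrary choice, and ζ(2) is recovered from the sum over odd
squares by the relation ζ(2) = π²/8 + ζ(2)/4.

```python
import math

N = 100000  # number of terms kept; an arbitrary choice for this check

# 1 - 1/3 + 1/5 - 1/7 + ...
leibniz = sum((-1) ** k / (2 * k + 1) for k in range(N))

# 1/1^2 + 1/3^2 + 1/5^2 + ...
odd_squares = sum(1 / (2 * k + 1) ** 2 for k in range(N))

# zeta(2) = odd_squares + zeta(2)/4, hence zeta(2) = odd_squares / (1 - 1/4)
zeta2 = odd_squares / (1 - 1 / 4)

print(leibniz, math.pi / 4)
print(odd_squares, math.pi ** 2 / 8)
print(zeta2, math.pi ** 2 / 6)
```

The alternating series converges slowly (its terms are of the order of
1/n), which is consistent with its origin in the Fourier series of a
discontinuous function.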


26. Fourier Series in Complex Form. Applying Euler’s formulas 
we can pass from a Fourier series containing trigonometric functions 
to a series in exponential functions which is sometimes preferable. 
To perform such a transformation of series (105) we can take the 
formulas 

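$$\cos\frac{n\pi x}{l} = \frac{e^{\frac{in\pi x}{l}} + e^{-\frac{in\pi x}{l}}}{2}$$

and

$$\sin\frac{n\pi x}{l} = \frac{e^{\frac{in\pi x}{l}} - e^{-\frac{in\pi x}{l}}}{2i}$$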

After the substitution is performed we combine similar terms on
the right-hand side and thus obtain a series in which the summation
is extended over all the exponential functions of the form
e^{inπx/l} (n = 0, ±1, ±2, . . .). Thus we obtain

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{\frac{in\pi x}{l}} \qquad (115)$$

where c_n (n = 0, ±1, ±2, . . .) are some complex coefficients.*

* A series of this type in which the summation index runs from −∞ to ∞
is sometimes referred to as a two-way series. - Tr.

To find the coefficients c_n we multiply both sides by e^{−inπx/l},
for a fixed n, and integrate the result from −l to l. The integrals
of the terms with numbers different from n are equal to zero:

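$$\int_{-l}^{l} c_m e^{\frac{im\pi x}{l}} e^{-\frac{in\pi x}{l}}\,dx = c_m \left.\frac{e^{\frac{i\pi(m-n)x}{l}}}{\frac{i\pi(m-n)}{l}}\right|_{-l}^{l} = \frac{c_m l}{i\pi(m-n)}\left[e^{i\pi(m-n)} - e^{-i\pi(m-n)}\right] = \frac{2c_m l}{\pi(m-n)}\sin\pi(m-n) = 0 \quad (m \ne n)$$

The integration of the term having the number n results in 2l c_n.
Thus, the coefficients in series (115) are computed by the formula

$$c_n = \frac{1}{2l}\int_{-l}^{l} f(x)\,e^{-\frac{in\pi x}{l}}\,dx \quad (n = 0, \pm 1, \pm 2, \dots) \qquad (116)$$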
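
Formula (116) can be tried out on a function whose complex coefficients
are known in advance. The sketch below is an illustration only: the
trigonometric polynomial f, the half-period l = 2 and the helper name
are assumptions made for the example, and the integral is approximated
by the midpoint rule.

```python
import cmath
import math

def complex_coeff(f, n, l, steps=2000):
    """c_n = (1/2l) * integral of f(x) exp(-i n pi x / l) over (-l, l)."""
    h = 2 * l / steps
    total = 0j
    for j in range(steps):
        x = -l + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * math.pi * x / l)
    return total * h / (2 * l)

l = 2.0

def f(x):
    # 1 + 2 cos(pi x / l) + 3 sin(2 pi x / l); by Euler's formulas
    # c_0 = 1, c_1 = c_{-1} = 1, c_2 = 3/(2i) = -1.5i, c_{-2} = 1.5i
    return 1 + 2 * math.cos(math.pi * x / l) + 3 * math.sin(2 * math.pi * x / l)

c = {n: complex_coeff(f, n, l) for n in range(-3, 4)}
for n in sorted(c):
    print(n, c[n])
```

The coefficients with opposite indices come out complex conjugate to
each other, as they must for a real function f(x).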

Let us discuss the general case of an expansion with respect to
an orthogonal system of complex functions. Two complex functions
g(x) and h(x) are said to be orthogonal to each other on the interval
a ≤ x ≤ b if

$$\int_a^b g(x)\,h^*(x)\,dx = 0 \qquad (117)$$

where the asterisk designates the complex conjugate function (see
Sec. VIII.3). In a special case when g and h are real this definition
coincides with former definition (91). It should be noted that if
we pass from (117) to the complex conjugate expression in both
sides we obtain

$$\left(\int_a^b g(x)\,h^*(x)\,dx\right)^* = \int_a^b \left(g(x)\,h^*(x)\right)^* dx = \int_a^b h(x)\,g^*(x)\,dx = 0$$

which means that the orthogonality condition is independent of 
the order in which we enumerate the functions, that is if g is ortho- 
gonal to h it follows that h is orthogonal to g, and hence we can 
speak about the mutual orthogonality of the functions. 

If we have a system of complex functions

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \qquad (118)$$

which are orthogonal on an interval a ≤ x ≤ b and if a complex
function f(x) can be expanded into a series in these functions [this
is always the case when system (118) is complete] we obtain

$$f(x) = c_1 g_1(x) + c_2 g_2(x) + \dots + c_n g_n(x) + \dots = \sum_{n=1}^{\infty} c_n g_n(x) \qquad (119)$$

To determine the coefficients we multiply both sides by g_n^*(x) for
a fixed n and integrate the result from a to b. By the orthogonality
condition, only one term on the right-hand side will be different
from zero and hence we obtain the formula

$$c_n = \frac{\int_a^b f(x)\,g_n^*(x)\,dx}{\int_a^b g_n(x)\,g_n^*(x)\,dx} = \frac{\int_a^b f(x)\,g_n^*(x)\,dx}{\int_a^b |g_n(x)|^2\,dx}$$

Expansion (115) is a particular case of (119) when the system
of functions

$$\dots,\ e^{-\frac{i2\pi x}{l}},\ e^{-\frac{i\pi x}{l}},\ 1,\ e^{\frac{i\pi x}{l}},\ e^{\frac{i2\pi x}{l}},\ \dots$$

(which is complete on the interval −l ≤ x ≤ l) is taken as system
(118). The expansion is valid for any bounded function and even
for an unbounded complex function f(x) which is absolutely
integrable over the interval (see Sec. XIV.16).

27. Parseval Relation. Let us return to real functions. Take an
orthogonal and complete system of functions on an interval
a ≤ x ≤ b:

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \qquad (120)$$

Let us consider the expansion of an arbitrary function f(x) into
a series in these functions. Squaring both sides of the expansion
and integrating the result from a to b we arrive at the integral

$$\int_a^b f^2(x)\,dx$$

on the left-hand side. We shall suppose that the integral has a finite 
numerical value. After the right-hand side is squared we obtain 
the sum of the squares of the terms and the products of different 
terms taken pairwise. The integrals of the latter are equal to zero 
according to the orthogonality of functions (120) whereas the inte- 
grals of the former are different from zero, and thus we have 

$$\int_a^b f^2(x)\,dx = \sum_{n=1}^{\infty} a_n^2 \int_a^b g_n^2(x)\,dx \qquad (121)$$

This formula is referred to as the Parseval relation (Parseval
theorem). In the particular case of a Fourier series of form (101)
we obtain

$$\int_{-\pi}^{\pi} f^2(x)\,dx = 2\pi a_0^2 + \pi\sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right)$$

and for series (105) we get

$$\int_{-l}^{l} f^2(x)\,dx = 2l a_0^2 + l\sum_{n=1}^{\infty}\left(a_n^2 + b_n^2\right)$$

The relations were found in 1805. By the way, they directly imply
that a_n → 0 and b_n → 0 as n → ∞.
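
The Parseval relation for a trigonometric Fourier series can be
checked on a simple case. The sketch below assumes the function
f(x) = x on (−π, π), for which a_0 = a_n = 0 and
b_n = 2(−1)^{n+1}/n (a standard result which is not derived in this
section); the function name is hypothetical.

```python
import math

# Left-hand side: integral of x^2 over (-pi, pi)
lhs = 2 * math.pi ** 3 / 3

def parseval_rhs(N):
    """pi * sum_{n=1..N} b_n^2 with b_n = 2(-1)^(n+1)/n, i.e. b_n^2 = 4/n^2."""
    return math.pi * sum((2.0 / n) ** 2 for n in range(1, N + 1))

partial = [parseval_rhs(N) for N in (10, 100, 1000, 10000)]
print(lhs)
print(partial)
```

The partial sums increase towards the left-hand side and never exceed
it; the slow approach reflects the fact that b_n is only of the order
of 1/n.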

If a system of type (120) is incomplete it is possible to prove that
it can be completed. After the completion, relation (121) becomes
true. But all the terms on the right-hand side are non-negative and
hence if a system of orthogonal functions is not complete we have
the inequality

$$\sum_{n=1}^{\infty} a_n^2 \int_a^b g_n^2(x)\,dx \le \int_a^b f^2(x)\,dx$$

for any function f(x). The sign of equality occurs here only for
those functions f(x) which can be expanded with respect to system
of functions (120).*

Parseval's equality (121) enables us to apply a new approach to
constructing a series in orthogonal functions. Let us be given a finite
number of functions

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x) \qquad (122)$$

which are orthogonal on an interval a ≤ x ≤ b. We now pose the
following problem: it is required to form a linear combination
of functions (122) whose mean square deviation from a given function
f(x) (see Sec. 7) is minimal.

* The last relation is referred to as Bessel's inequality. - Tr.

To solve the problem let us consider functions (122) to be a part
of a complete orthogonal system of form (120). Then

$$f(x) - \sum_{k=1}^{n} C_k g_k(x) = \sum_{k=1}^{\infty} a_k g_k(x) - \sum_{k=1}^{n} C_k g_k(x) = \sum_{k=1}^{n} (a_k - C_k)\,g_k(x) + \sum_{k=n+1}^{\infty} a_k g_k(x)$$

where a_k are the Fourier coefficients of the expansion of f(x) with
respect to functions (120) and C_k are arbitrary constants. By equality
(121), we can write

$$\int_a^b \left[f(x) - \sum_{k=1}^{n} C_k g_k(x)\right]^2 dx = \sum_{k=1}^{n} (a_k - C_k)^2 \int_a^b g_k^2(x)\,dx + \sum_{k=n+1}^{\infty} a_k^2 \int_a^b g_k^2(x)\,dx$$

The last formula indicates that if the coefficients C_1, C_2, . . ., C_n
are varied in an arbitrary way the minimal value of the right-hand
side is attained when

$$C_1 = a_1,\quad C_2 = a_2,\quad \dots,\quad C_n = a_n$$

Thus, to obtain a linear combination of a fixed number of orthogonal
functions (122) whose mean square deviation from f(x) is
minimal we must take the corresponding partial sum of the expansion
of f(x) into a series with respect to the system of orthogonal
functions (120), that is the linear combination with the coefficients
defined by formulas (100).
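
The minimal property of the Fourier coefficients can be observed
numerically. The sketch below is an illustration under assumptions made
here: the function is f(x) = x on (−π, π), the orthogonal functions are
sin x, sin 2x, sin 3x, the Fourier coefficients 2, −1, 2/3 are the
standard values b_k = 2(−1)^{k+1}/k, and the integral is approximated by
the midpoint rule.

```python
import math

def sq_deviation(coeffs, steps=2000):
    """Integral over (-pi, pi) of [x - sum_k C_k sin(kx)]^2, midpoint rule."""
    h = 2 * math.pi / steps
    total = 0.0
    for j in range(steps):
        x = -math.pi + (j + 0.5) * h
        s = sum(C * math.sin((k + 1) * x) for k, C in enumerate(coeffs))
        total += (x - s) ** 2
    return total * h

fourier = [2.0, -1.0, 2.0 / 3.0]   # Fourier coefficients of f(x) = x
best = sq_deviation(fourier)

# Shift one coefficient at a time by +-0.1 and recompute the deviation
worse = [sq_deviation([c + d if i == m else c for i, c in enumerate(fourier)])
         for m in range(3) for d in (-0.1, 0.1)]
print(best)
print(worse)
```

Each perturbed deviation exceeds the minimal one by approximately
(0.1)² ∫ sin² kx dx = 0.01π, in agreement with the term
(a_k − C_k)² ∫ g_k² dx of the last formula.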

An analogous argument applied to a complete orthogonal system 
of complex functions of type (118) and to series (119) leads to the 
formula 

$$\int_a^b |f(x)|^2\,dx = \sum_{n=1}^{\infty} |a_n|^2 \int_a^b |g_n(x)|^2\,dx \qquad (123)$$

if we take into account the equality aa* = |a|².

28. Hilbert Space. We now return to real functions defined on
a finite interval a ≤ x ≤ b. It turns out that there exists a
far-reaching analogy between such functions and vectors which we
mentioned in Sec. 20. Since we can perform linear operations on
the functions according to ordinary algebraic rules the functions
form a linear space in the sense of the definition given in Sec. VII.17.
Moreover, if we introduce the notion of a scalar product of two
functions by means of the formula

$$(f, g) = \int_a^b f(x)\,g(x)\,dx \qquad (124)$$

we can readily verify that all the axioms enumerated in Sec. VII.20
are fulfilled and thus we obtain a space which is not only linear but
Euclidean as well. We include in this space all the bounded functions
and also all the unbounded functions with integrable (summable)
squares, that is such functions f(x) that

$$(f, f) = \int_a^b f^2(x)\,dx < \infty \qquad (125)$$

The simple inequality 2|f(x)g(x)| ≤ f²(x) + g²(x) implies that the
integral in formula (124) for functions satisfying condition (125) is
convergent (provided it is an improper integral).

The set of functions satisfying condition (125) equipped with
scalar product (124) is called the Hilbert space L_2 after the famous
German mathematician D. Hilbert (1862-1943). This is an
infinite-dimensional Euclidean space (see Sec. VII.20). By definition (124),
condition (91) is nothing but the orthogonality condition for vectors
belonging to this space. A complete orthogonal system of functions
is an orthogonal basis in the space L_2. It should be noted that when
applying the notion of a complete system of functions (see Sec. 21)
to the space L_2 we must interpret the convergence of series (99)
as the convergence in the mean square (see Sec. 8). Thus, the
convergence in L_2 is the convergence in the mean square. Formulas (100)
for the coefficients of an expansion are a particular case of formulas
(VII. …9) and the orthogonalization process described in Sec. 20
is nothing but a realization of the process discussed in Sec. VII.21.
Further, Parseval's equality (121) in the Hilbert space L_2 is analogous
to Pythagoras' theorem: the square of the diagonal of a rectangular
parallelepiped is equal to the sum of squares of all its
dimensions. (We suggest that the reader try to interpret geometrically
the property of partial sums of series (99) which, as it was
shown in Sec. 27, minimize the mean square deviation.)

A characteristic feature of a Hilbert space is that it is
infinite-dimensional. This property makes it difficult to test the
completeness of an orthogonal system of functions. A system of k pairwise
orthogonal nonzero vectors belonging to an n-dimensional Euclidean space
is complete if k = n and incomplete if k < n. In contrast to it
an infinite orthogonal system containing infinitely many functions
belonging to an infinite-dimensional space may not be complete,
i.e. the number of functions does not enable us to answer the
question whether a given orthogonal system is complete in the
case of an infinite-dimensional space. This is a rather difficult
problem and we shall not consider it here.

It can be proved that any incomplete orthogonal system is a part
of a complete orthogonal system. Therefore a function can be expanded
with respect to the former if and only if its expansion with
respect to the latter has zero coefficients in those functions of the
system which are not contained in the former.

In Sec. VII.20 we proved that for Euclidean spaces there is an
important inequality of form (VII.26). Applying it to a function
space and squaring we deduce the inequality

$$\left(\int_a^b f(x)\,g(x)\,dx\right)^2 \le \left(\int_a^b f^2(x)\,dx\right)\left(\int_a^b g^2(x)\,dx\right)$$

Substituting |f| for f and 1 for g into the inequality we get

$$\left(\int_a^b |f(x)|\,dx\right)^2 \le (b - a)\int_a^b f^2(x)\,dx$$

This implies that all the functions belonging to L_2 are summable
and that convergence in the mean square implies convergence
in the mean*. The converse may not be true. By the way, it
is possible to prove that the Fourier series of any summable
function converges to the function in the mean.

29. Orthogonality with Weight Function. When we integrate
a function f(x) over an interval a ≤ x ≤ b all the values of x
belonging to the interval are equivalent. But if we want to stress
the importance of a certain value of x in comparison with the others
we introduce a weight function (weighting function) p(x) ≥ 0 when
performing the integration:

$$\int_a^b f(x)\,p(x)\,dx$$

The function p(x) is so chosen that its values should be greater for
the values of x which are considered to be more important.

Two functions g(x) and h(x) are said to be orthogonal with weight
function p(x) on an interval (a, b) if

$$\int_a^b g(x)\,h(x)\,p(x)\,dx = 0$$

The whole theory of series in orthogonal functions presented in
Secs. 20, 21, 26 and 27 is directly extended to functions orthogonal
with weight function; to do this we must simply introduce the
factor p(x) under the sign of integration in all the formulas.

* Convergence in the mean is understood here as convergence in the
mean of order one; see footnote on page 663. - Tr.

An integral involving a weighting function p(x) can be easily
transformed to an integral without a weighting function (i.e. with
the weighting function p = 1) by means of change of independent
variable:

$$p(x)\,dx = d\bar{x},\quad \bar{x} = \int p(x)\,dx = \bar{x}(x),\quad \bar{x}(a) = \bar{a} \ \text{ and } \ \bar{x}(b) = \bar{b} \qquad (126)$$

Indeed, if we introduce the notation f̄(x̄) = f(x(x̄)) = f(x) we
obtain

$$\int_a^b f(x)\,p(x)\,dx = \int_{\bar{a}}^{\bar{b}} \bar{f}(\bar{x})\,d\bar{x}$$

Under such a transformation any system of functions orthogonal 
with weight function p turns into a system of functions orthogonal 
in the sense of our former definition (Sec. 20). But nevertheless it 
is sometimes convenient to consider functions orthogonal with 
weight function without transforming them according to formula 
(126). 

Conversely, change of variable (126) enables us to pass from any
orthogonal (in the ordinary sense) system of functions to a system
orthogonal with weight function p. Such a transformation yields the
corresponding transformation of the expansions in series and therefore
a complete system is transformed to a complete one. For instance,
taking the complete orthogonal system of functions

$$1,\ \cos\bar{x},\ \cos 2\bar{x},\ \dots,\ \cos n\bar{x},\ \dots \quad (0 \le \bar{x} \le \pi)$$

[see system (94)] and performing the transformation

$$\bar{x} = \pi - \arccos x,\quad d\bar{x} = \frac{1}{\sqrt{1 - x^2}}\,dx \quad (-1 \le x \le 1)$$

we obtain the functions

$$T_0(x) = 1,\quad T_1(x) = \cos(\arccos x) = x,$$
$$T_2(x) = \cos(2\arccos x) = 2x^2 - 1,$$
$$T_3(x) = \cos(3\arccos x) = 4x^3 - 3x,\ \dots,$$
$$T_n(x) = \cos(n\arccos x),\ \dots$$

after an inessential change of the signs has been made (check this!).
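
The polynomial form of the functions T_n and their orthogonality with
the weight 1/√(1 − x²), which is what the change of variable gives on
−1 ≤ x ≤ 1, can be checked numerically. The sketch below is an
illustration only: it uses the recurrence
T_{n+1}(x) = 2x T_n(x) − T_{n−1}(x), a standard property not stated in
the text, and approximates the weighted integral by the midpoint rule in
the variable θ = arccos x (which amounts to Gauss-Chebyshev quadrature);
the function names are hypothetical.

```python
import math

def T(n, x):
    """Chebyshev polynomial T_n(x) from T_{n+1} = 2x T_n - T_{n-1}."""
    t_prev, t = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t = t, 2 * x * t - t_prev
    return t

def weighted_inner(m, n, nodes=64):
    """Integral of T_m(x) T_n(x) / sqrt(1 - x^2) over (-1, 1)."""
    total = 0.0
    for j in range(nodes):
        x = math.cos((2 * j + 1) * math.pi / (2 * nodes))
        total += T(m, x) * T(n, x)
    return total * math.pi / nodes

x = 0.3
print(T(3, x), 4 * x ** 3 - 3 * x, math.cos(3 * math.acos(x)))
print(weighted_inner(2, 3), weighted_inner(2, 2), weighted_inner(0, 0))
```

Integrals with m ≠ n should vanish to rounding error, while the values
for m = n ≥ 1 and for m = n = 0 should be π/2 and π.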

These polynomials were introduced by Chebyshev in 1857. They
are referred to as Chebyshev's polynomials. We see that they form
a complete system of functions orthogonal with weight function
1/√(1 − x²) on the interval −1 ≤ x ≤ 1. The polynomials can also
be obtained from system of functions (97) by means of the
orthogonalization process with this weight function (let the reader
perform the transformation!).

To eliminate a weight function we can also apply the following
method: if we are given a system of functions

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \qquad (127)$$

orthogonal with weight function p(x) on an interval a ≤ x ≤ b
the system of functions

$$g_1(x)\sqrt{p(x)},\ g_2(x)\sqrt{p(x)},\ \dots,\ g_n(x)\sqrt{p(x)},\ \dots \qquad (128)$$

is orthogonal without weight function on the same interval (check
this!). To expand a function f(x) in a series with respect to system
(127) it is sufficient to expand the function f(x)√p(x) with respect
to system (128) and then cancel out the factor √p(x).

30. Multiple Fourier Series. When expanding a function of several 
independent variables in a series we usually take a system of func- 
tions dependent on several indices whose number equals the number 
of the arguments. Then we arrive at multiple series as in Sec. 17. 

The theory of multiple series with respect to systems of orthogonal
functions is developed by analogy with the theory of ordinary series.
For the sake of simplicity, we restrict ourselves to the case of
functions of two arguments. In this case functions forming a complete
orthogonal system in a domain D must depend on two indices,
that is have the form φ_mn(x, y) where m and n assume some
discrete values. A series with respect to such a system is of the form

$$f(x, y) = \sum_{m,n} a_{mn}\,\varphi_{mn}(x, y)$$

where the symbol Σ with the indices m, n denotes the corresponding
two-fold sum. The coefficients are found after the manner of Sec. 21:

$$a_{mn} = \frac{\iint_D f(x, y)\,\varphi_{mn}(x, y)\,dx\,dy}{\iint_D \varphi_{mn}^2(x, y)\,dx\,dy} \qquad (129)$$

There is a method of constructing a system of orthogonal functions
of several arguments on the basis of some given systems of functions
of one argument. Let us be given two complete orthogonal systems
of functions of one independent variable:

$$g_1(x),\ g_2(x),\ \dots,\ g_n(x),\ \dots \quad (a \le x \le b)$$

and

$$h_1(y),\ h_2(y),\ \dots,\ h_n(y),\ \dots \quad (c \le y \le d)$$

Then the system of functions

$$\varphi_{mn}(x, y) = g_m(x)\,h_n(y) \quad (m, n = 1, 2, \dots) \qquad (130)$$

is orthogonal and complete in the rectangle Π: a ≤ x ≤ b, c ≤ y ≤ d.
The orthogonality of the system is implied by the relation

$$\iint_{\Pi} \varphi_{mn}(x, y)\,\varphi_{\bar{m}\bar{n}}(x, y)\,dx\,dy = \int_a^b dx \int_c^d g_m(x)\,h_n(y)\,g_{\bar{m}}(x)\,h_{\bar{n}}(y)\,dy = \left(\int_a^b g_m(x)\,g_{\bar{m}}(x)\,dx\right)\left(\int_c^d h_n(y)\,h_{\bar{n}}(y)\,dy\right)$$

which always yields zero except the case when we simultaneously
have m = m̄ and n = n̄. The completeness can also be easily proved.
Namely, given an arbitrary function f(x, y) defined in Π, we can
expand it with respect to functions h_n(y) for any fixed x:

$$f(x, y) = \sum_{n=1}^{\infty} A_n(x)\,h_n(y)$$

where the coefficients of the expansion are dependent on x. Now we
can expand these coefficients A_n(x) with respect to the functions
g_m(x) which results in a double series with respect to system of
functions (130):

$$f(x, y) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} a_{mn}\,g_m(x)\,h_n(y)$$

Thus we have obtained what we set out to prove. Hence the expansion
is possible.

Taking two systems of functions of form (95) defined on extended
intervals 0 ≤ x ≤ l_1 and 0 ≤ y ≤ l_2 we can form a complete
orthogonal system of functions on the corresponding rectangle:

$$\varphi_{mn}(x, y) = \sin\frac{m\pi x}{l_1}\,\sin\frac{n\pi y}{l_2} \quad (m, n = 1, 2, \dots)$$

An expansion with respect to this system has the form

$$f(x, y) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} a_{mn}\sin\frac{m\pi x}{l_1}\,\sin\frac{n\pi y}{l_2} \quad (0 \le x \le l_1,\ 0 \le y \le l_2)$$

The coefficients in the series are found on the basis of formula (129):

$$a_{mn} = \frac{4}{l_1 l_2}\int_0^{l_1} dx \int_0^{l_2} f(x, y)\,\sin\frac{m\pi x}{l_1}\,\sin\frac{n\pi y}{l_2}\,dy$$

(verify the result!).
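
The coefficient formula of the double sine series can be tried on a
function whose expansion is known. The sketch below is an illustration
only: the rectangle sizes l1 = 1, l2 = 2, the sample function f and the
helper name are assumptions made here, and the double integral is
approximated by the midpoint rule in both variables.

```python
import math

def double_sine_coeff(f, m, n, l1, l2, steps=60):
    """a_mn = 4/(l1 l2) * double integral of f sin(m pi x/l1) sin(n pi y/l2)."""
    hx, hy = l1 / steps, l2 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx
        sx = math.sin(m * math.pi * x / l1)
        for j in range(steps):
            y = (j + 0.5) * hy
            total += f(x, y) * sx * math.sin(n * math.pi * y / l2)
    return 4.0 / (l1 * l2) * total * hx * hy

l1, l2 = 1.0, 2.0

def f(x, y):
    # A finite double sine sum with a_12 = 3 and a_21 = -0.5
    return (3.0 * math.sin(math.pi * x / l1) * math.sin(2 * math.pi * y / l2)
            - 0.5 * math.sin(2 * math.pi * x / l1) * math.sin(math.pi * y / l2))

a12 = double_sine_coeff(f, 1, 2, l1, l2)
a21 = double_sine_coeff(f, 2, 1, l1, l2)
a11 = double_sine_coeff(f, 1, 1, l1, l2)
print(a12, a21, a11)
```

The factor 4/(l_1 l_2) is the reciprocal of the integral of φ_mn² over
the rectangle, which equals (l_1/2)(l_2/2).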

31. Application to the Equation of Oscillations of a String. Fourier
series have many applications in the theory of equations of
mathematical physics. Here we shall give an example of such an
application to the problem of solving the equation of small free
transverse oscillations of a taut string.*

We shall consider the string to be weightless. Suppose the string
has a finite length l. Let us draw the x-axis along the string in its
equilibrium state so that the ends of the string have the coordinates
x = 0 and x = l, respectively. We shall study plane oscillations
and denote by u(x, t) the transverse deflection of the point of the
string with abscissa x at the moment t from the equilibrium state.
In the theory of equations of mathematical physics it is proved that
the function u(x, t) satisfies the following partial differential
equation of the second order:

$$\frac{\partial^2 u}{\partial t^2} = a^2\,\frac{\partial^2 u}{\partial x^2} \qquad (131)$$

Here a is a constant (a = √(T/ρ) where T is the tension of the string
and ρ = const is its linear density). We shall consider the ends of
the string to be fixed. Then the corresponding boundary conditions
expressing this fact can be written in the form

$$u\big|_{x=0} = 0,\quad u\big|_{x=l} = 0 \quad (\text{for all } t) \qquad (132)$$


We shall also suppose that the deflections and velocities of the
points of the string at the initial moment of time t = 0 are known.
Then we can write the initial conditions of the form

$$u\big|_{t=0} = \varphi(x),\quad \frac{\partial u}{\partial t}\bigg|_{t=0} = \psi(x) \qquad (133)$$

where φ(x) and ψ(x) are some given functions. Hence, we arrive
at the following mathematical problem: it is required to solve
equation (131) for boundary conditions (132) and initial conditions
(133).

To solve the problem let us look for the solution in
the form of a series of type (107) for each fixed t ≥ 0. Then the
coefficients will be dependent on t and thus we get

$$u(x, t) = \sum_{n=1}^{\infty} b_n(t)\,\sin\frac{n\pi x}{l} \qquad (134)$$

* Equations of mathematical physics (and, in particular, the application
of Fourier series to solving the equation of oscillation of a string) are
treated in greater detail in the Appendix at the end of this English
edition. - Tr.

To find the coefficients b_n(t) we substitute this expression into
equation (131). This results in

$$\sum_{n=1}^{\infty} b_n''(t)\,\sin\frac{n\pi x}{l} = -a^2\sum_{n=1}^{\infty} b_n(t)\,\frac{n^2\pi^2}{l^2}\,\sin\frac{n\pi x}{l}$$

that is

$$b_n''(t) = -\frac{a^2 n^2 \pi^2}{l^2}\,b_n(t)$$

Solving this ordinary linear differential equation with constant
coefficients by applying the methods given in Sec. XV.17 we
derive

$$b_n(t) = A_n\cos\frac{n\pi a}{l}t + B_n\sin\frac{n\pi a}{l}t \quad (A_n,\ B_n = \text{const})$$

By (134), it follows that

$$u = \sum_{n=1}^{\infty}\left(A_n\cos\frac{n\pi a}{l}t + B_n\sin\frac{n\pi a}{l}t\right)\sin\frac{n\pi}{l}x \qquad (135)$$

To determine the coefficients A_n and B_n (n = 1, 2, 3, ...) we take
advantage of initial conditions (133). This yields

$$\varphi(x) = \sum_{n=1}^{\infty} A_n\sin\frac{n\pi}{l}x,\quad \psi(x) = \sum_{n=1}^{\infty} B_n\,\frac{n\pi a}{l}\,\sin\frac{n\pi}{l}x \quad (0 \le x \le l)$$

(check the result!). We have arrived at Fourier expansions
of form (107) from which we find

$$A_n = \frac{2}{l}\int_0^l \varphi(x)\,\sin\frac{n\pi x}{l}\,dx \quad\text{and}\quad B_n = \frac{2}{n\pi a}\int_0^l \psi(x)\,\sin\frac{n\pi x}{l}\,dx$$

Substituting these quantities into (135) we thus obtain the required
solution. The boundary conditions are automatically satisfied here
because of the properties of the functions sin(nπx/l) (n = 1, 2, 3, . . .)
which satisfy conditions (132).
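
The whole procedure can be sketched numerically for a concrete case.
The example below is an illustration under assumptions made here: a
string of length l = 1 with a = 2, plucked at its midpoint and released
from rest (so ψ = 0), the series (135) truncated at N = 60 terms, and
the integrals for A_n and B_n approximated by the midpoint rule.

```python
import math

l, a = 1.0, 2.0      # hypothetical string length and constant a of (131)
N = 60               # number of harmonics kept in series (135)

def phi(x):
    """Initial deflection: a triangle of height 0.1 with its peak at l/2."""
    return 0.1 * (1 - abs(2 * x / l - 1))

def psi(x):
    """Initial velocity: the string is released from rest."""
    return 0.0

def sine_integral(g, n, steps=400):
    """Integral of g(x) sin(n pi x / l) over (0, l), midpoint rule."""
    h = l / steps
    return sum(g((j + 0.5) * h) * math.sin(n * math.pi * (j + 0.5) * h / l)
               for j in range(steps)) * h

A = [2 / l * sine_integral(phi, n) for n in range(1, N + 1)]
B = [2 / (n * math.pi * a) * sine_integral(psi, n) for n in range(1, N + 1)]

def u(x, t):
    """Partial sum of series (135)."""
    return sum((A[n - 1] * math.cos(n * math.pi * a * t / l)
                + B[n - 1] * math.sin(n * math.pi * a * t / l))
               * math.sin(n * math.pi * x / l) for n in range(1, N + 1))

print(u(0.5, 0.0), phi(0.5))          # initial shape is reproduced
print(u(0.0, 0.3), u(l, 0.3))         # the ends stay fixed
print(u(0.3, 0.2), u(0.3, 0.2 + 2 * l / a))
```

Since every term of (135) has the period 2l/a in t, the last two
printed values coincide: the motion of the string is periodic with the
period of the fundamental harmonic.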


§ 5. Fourier Transformation 

32. Fourier Transform. Let us take formula (115) which represents
any finite function f(x) defined over an interval −l < x < l
in the form of a complex Fourier series. This representation
involves complex harmonics, i.e. the functions e^{ikx} with wave
numbers k = k_n where

$$k_n = \frac{n\pi}{l} \quad (n = \dots, -2, -1, 0, 1, 2, \dots) \qquad (136)$$

The set of these numbers (it is depicted in Fig. 342) is called the
spectrum of wave numbers. It is discrete, that is it consists of separate
points to each of which there corresponds a harmonic e^{ik_n x} in
expansion (115) with complex amplitude c_n. The quantity c_n (n = 0, ±1,
±2, . . .) is defined by the formula

$$c_n = \frac{1}{2l}\int_{-l}^{l} f(x)\,e^{-ik_n x}\,dx \qquad (137)$$

which is implied by formula (116).

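For the sake of simplicity, we first suppose that the function
f(x) identically vanishes outside a finite interval a ≤ x ≤ b.

Fig. 342 (the wave numbers . . ., k_{-1}, k_0, k_1, k_2, . . . marked
on the k-axis at the spacing π/l)

Introduce the notation

$$\tilde{f}(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\,e^{-ikx}\,dx \qquad (138)$$

This integral is in fact taken only over the interval a ≤ x ≤ b.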
If l is sufficiently large we can rewrite formula (137) of a Fourier
coefficient in the form

$$c_n = \frac{1}{2l}\int_a^b f(x)\,e^{-ik_n x}\,dx = \frac{\pi}{l}\cdot\frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\,e^{-ik_n x}\,dx = \tilde{f}(k_n)\,\Delta k \qquad (139)$$

where Δk = π/l is the distance between the neighbouring points
representing the wave numbers in the spectrum [see formula (136)
and Fig. 342]. Formula (115) representing f(x) can therefore be
rewritten as

$$f(x) = \sum_n c_n e^{ik_n x} = \sum_n \tilde{f}(k_n)\,e^{ik_n x}\,\Delta k \quad (-l < x < l) \qquad (140)$$
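
Relation (139) between a Fourier coefficient and the function defined
by (138) can be checked on a rectangular pulse. The sketch below is an
illustration under assumptions made here: f(x) = 1 for −1 ≤ x ≤ 1 and
f(x) = 0 elsewhere, for which (138) gives sin k/(πk) in closed form;
the half-period l = 8 and the helper names are hypothetical.

```python
import cmath
import math

def f(x):
    """Rectangular pulse vanishing outside the interval [-1, 1]."""
    return 1.0 if -1.0 <= x <= 1.0 else 0.0

def f_tilde(k):
    """Closed form of (138) for the pulse: (1/2pi) * integral of exp(-ikx)."""
    return 1 / math.pi if k == 0 else math.sin(k) / (math.pi * k)

def c(n, l, steps=1600):
    """Coefficient (137) on (-l, l), computed by the midpoint rule."""
    h = 2 * l / steps
    k_n = n * math.pi / l
    total = 0j
    for j in range(steps):
        x = -l + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * k_n * x)
    return total * h / (2 * l)

l = 8.0
dk = math.pi / l                      # spacing of the wave numbers k_n
for n in range(6):
    print(n, c(n, l), f_tilde(n * dk) * dk)
```

Increasing l makes every c_n smaller in proportion to Δk = π/l while
the ratio c_n/Δk stays equal to the value of (138) at k_n.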

Now suppose that l is very large. Then the spectrum becomes very 
“dense’’ and Ak very small. In the limit, as l -*■ oo, sum (140) which 
is an integral sum turns into the corresponding integral, i.e. we 
obtain 

OO 

f( x )= [ /(fye'^dk (— co <z< o o) (141) 

— OO 

In this representation the wave number $k$ runs over all the values
ranging from $-\infty$ to $\infty$, i.e. the discrete spectrum of wave numbers
turns into a continuous spectrum in the limiting process as $l \to \infty$.

The first equality (139) shows that the amplitudes $c_n \to 0$ as
$l \to \infty$. This means that in the limit each harmonic has a zero
amplitude. But at the same time if we take an infinitesimal (but different
from zero) interval of wave numbers between $k$ and $k + dk$ we
obtain the amplitude

$$dc = \tilde f(k)\, dk \tag{142}$$

corresponding to the interval. Hence, in the limit the amplitude turns
out to be distributed over the whole continuous spectrum of wave
numbers. This resembles the transition from a discrete model
of a material body to its continuous model. Actually, in this
transition we assume that the mass of each separate point is equal to
zero and thus the total mass becomes continuously distributed over
all the points with a certain density. By analogy, we can say that
formula (142) describes the distribution of amplitudes of harmonics
with density $\tilde f(k)$. Thus, $\tilde f(k)$ is the density of the amplitude on an
infinitesimal interval of wave numbers. The density is related
to unit measure of length of the interval [$\tilde f(k)$ is also referred
to as the spectral density of the function f(x)].
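
The relation (139) between the coefficients $c_n$ and the spectral density can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the test function (a triangular pulse vanishing outside $-1 < x < 1$) and the helper names `f`, `integrate`, `c_n` and `f_tilde` are ours and do not come from the text.

```python
import cmath
import math

def f(x):
    """Test function (an assumption): triangular pulse, zero outside (-1, 1)."""
    return max(0.0, 1.0 - abs(x))

def integrate(g, a, b, steps=20000):
    """Midpoint rule for a (possibly complex-valued) integrand g on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (j + 0.5) * h) for j in range(steps))

def c_n(n, l):
    """Fourier coefficient (137) on the interval (-l, l)."""
    k_n = n * math.pi / l
    return integrate(lambda x: f(x) * cmath.exp(-1j * k_n * x), -l, l) / (2.0 * l)

def f_tilde(k):
    """Spectral density (138); f vanishes outside (-1, 1)."""
    return integrate(lambda x: f(x) * cmath.exp(-1j * k * x), -1.0, 1.0) / (2.0 * math.pi)

for l in (2.0, 8.0, 32.0):
    dk = math.pi / l                      # spacing of the spectrum, formula (136)
    # Columns: l, |c_3|, |f_tilde(k_3) * dk|; the last two should agree.
    print(l, abs(c_n(3, l)), abs(f_tilde(3 * dk) * dk))
```

The two printed amplitudes agree for every $l$ up to quadrature error, which is the content of (139). Evaluating $c_n$ at a fixed wave number for growing $l$ (rather than at a fixed index $n$) shows the amplitude decreasing in proportion to $\Delta k$.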

Formulas (138) and (141) express the so-called Fourier
transformation*. Formula (138) defines the direct transformation and
formula (141) the inverse transformation. We have deduced the
formulas under the assumption that the function f(x) is identically
equal to zero outside a finite interval. Such functions are called
finite**. A more extensive investigation shows that the formulas
remain valid when integral (138) is understood as an improper
integral. For the integral to be convergent, it is sufficient to impose
the additional condition

$$\int_{-\infty}^{\infty} |f(x)|\, dx < \infty \tag{143}$$

Thus, to each function f(x) satisfying condition (143) there
corresponds its Fourier transform $\tilde f(k)$ [which is the result, or image,
arising from the Fourier transformation applied to f(x)], the
transformation being defined by formula (138). Conversely, the function f(x) is
expressed in terms of its Fourier transform by formula (141) and
is called the Fourier inverse transform [i.e. the inverse image,
pre-image, which is the result of the Fourier inverse transformation
performed on the function $\tilde f(k)$].

* The expression on the right-hand side of (141) is also called the Fourier
integral. (Tr.)

** The term "finite function" should not be confused with the term "bounded
function", whose range is contained in some finite interval (although we
sometimes say "finite" instead of "bounded"). To avoid the confusion we can
use the term "a function of finite support" when speaking about functions
identically vanishing outside an interval, the term taken from functional
analysis. (Tr.)

If f(x) is an even function we can use property 9, Sec. XIV.4,
when computing integral (138), which yields

$$\tilde f(k) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(x) \cos kx\, dx - \frac{i}{2\pi} \int_{-\infty}^{\infty} f(x) \sin kx\, dx = \frac{1}{\pi} \int_0^{\infty} f(x) \cos kx\, dx$$

It follows that $\tilde f(k)$ is also an even function in this case and hence,
on the basis of formula (141), we obtain

$$f(x) = 2 \int_0^{\infty} \tilde f(k) \cos kx\, dk$$

These formulas define the so-called Fourier cosine transform and
its inverse image. Similarly, if f(x) is an odd function we arrive
at the formulas defining the Fourier sine transform and its pre-image:

$$i\tilde f(k) = \frac{1}{\pi} \int_0^{\infty} f(x) \sin kx\, dx, \qquad f(x) = 2 \int_0^{\infty} i\tilde f(k) \sin kx\, dk$$

By the way, in the last case the term "Fourier sine transform of
a function f(x)" is usually applied to the function $i\tilde f(k)$ instead
of $\tilde f(k)$.

If a function f(x) is originally defined on the positive semi-axis
$0 < x < \infty$ it can be extended onto the interval $-\infty < x < 0$
either as an even or as an odd function. Therefore considering both 
x and k to be positive we can use the cosine transform as well as 
the sine transform. But the images of these transformations will 
be different in the general case. 

Let us take an example. Suppose f(x) is an even function equal
to 1 on the interval $-1 < x < 1$ and identically vanishing outside
it. Then by the formula of the Fourier cosine transform we get

$$\tilde f(k) = \frac{1}{\pi} \left( \int_0^1 1 \cdot \cos kx\, dx + \int_1^{\infty} 0 \cdot \cos kx\, dx \right) = \frac{\sin k}{\pi k}$$

Applying the inverse transformation we obtain

$$f(x) = 2 \int_0^{\infty} \frac{\sin k}{\pi k} \cos kx\, dk \tag{144}$$

As in the case of a Fourier series (see Sec. 25), if we substitute
numerical values of x into the formula of the inverse Fourier transform
we get the corresponding values of f(x) at all the points of continuity
of f and the values $\frac{1}{2}[f(x - 0) + f(x + 0)]$ at all points where f
has a finite jump. In particular, substituting into (144) the value
$x = 0$, at which the function f is continuous, we get

$$1 = 2 \int_0^{\infty} \frac{\sin k}{\pi k}\, dk, \quad \text{which implies} \quad \int_0^{\infty} \frac{\sin k}{k}\, dk = \frac{\pi}{2}$$

An integral formula equivalent to the formulas of the Fourier 
transforms was obtained by Fourier in 1811. 
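
The behaviour of formula (144) at points of continuity and at the jump can be observed numerically. The sketch below is an illustration only: the infinite upper limit is replaced by a finite cut-off `K_MAX` (our choice), Simpson's rule is used for the quadrature, and all function names are hypothetical.

```python
import math

K_MAX = 200.5 * math.pi   # finite cut-off replacing the infinite upper limit (an assumption)
STEPS = 100000            # even number of Simpson subintervals

def f_tilde(k):
    """Cosine transform of the unit box found in the text: sin k / (pi k)."""
    return 1.0 / math.pi if k == 0.0 else math.sin(k) / (math.pi * k)

def simpson(g, a, b, steps):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * g(a + j * h)
    return total * h / 3.0

def f_from_inverse(x):
    """Truncated version of (144): 2 * integral from 0 to K_MAX of f_tilde(k) cos(kx) dk."""
    return 2.0 * simpson(lambda k: f_tilde(k) * math.cos(k * x), 0.0, K_MAX, STEPS)

print(f_from_inverse(0.0))   # near 1: a point of continuity inside the box
print(f_from_inverse(1.0))   # near 1/2: the mean of the one-sided limits at the jump
print(f_from_inverse(2.0))   # near 0: a point of continuity outside the box

# The integral of sin k / k over (0, K_MAX), to be compared with pi / 2.
dirichlet = simpson(lambda k: math.pi * f_tilde(k), 0.0, K_MAX, STEPS)
print(dirichlet, math.pi / 2.0)
```

Because the integrand decays only like $1/k$, the truncated integral converges slowly; the cut-off here is placed at a zero of $\cos k$ to keep the truncation error at $x = 0$ small.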

33. Properties of Fourier Transforms. A Fourier transform possesses 
many useful properties. We are going to enumerate some of them 
here. First of all, it is clear that a Fourier transformation can be 
interpreted as an operator (see Sec. XIV.26) for which the function
f(x) is the inverse image (pre-image) and the function $\tilde f(k)$ is the
image.

1. The Fourier operator is linear, that is

$$\widetilde{(f_1 + f_2)} = \tilde f_1 + \tilde f_2, \qquad \widetilde{(af)} = a \tilde f \quad (a = \text{const}) \tag{145}$$

This is directly implied by formula (138) and by the fact that the 
integration is a linear operation. 

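Formula (145) implies, in particular, that if f depends not only
on x but also on a parameter t the function $\tilde f$ is dependent on the
parameter as well, and we have

$$\widetilde{\left( \frac{f_{t+\Delta t} - f_t}{\Delta t} \right)} = \frac{\tilde f_{t+\Delta t} - \tilde f_t}{\Delta t}$$

Passing to the limit, as $\Delta t \to 0$, we obtain

$$\widetilde{\left( \frac{\partial f}{\partial t} \right)} = \frac{\partial \tilde f}{\partial t}$$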

Consequently, the derivative with respect to a parameter of the 
Fourier inverse transform is transformed into the derivative with 
respect to the parameter of the Fourier transform. By the way, 
the method of proving the property indicates that this property is 
common to all the linear operators. 

2. The differentiation of the function f with respect to x results
in the multiplication of its Fourier transform $\tilde f$ by $ik$. Indeed, the
Fourier transform of the function $f'(x)$ is

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} f'(x)\, e^{-ikx}\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx}\, df(x) = \frac{1}{2\pi} f(x)\, e^{-ikx} \Big|_{x=-\infty}^{x=\infty} - \frac{1}{2\pi} \int_{-\infty}^{\infty} f(x)\,(-ik)\, e^{-ikx}\, dx$$

(we have performed integration by parts here). But condition (143)
indicates that $f(\pm\infty) = 0$ and therefore the first summand on the
right-hand side vanishes. The second summand is equal to

$$ik \cdot \frac{1}{2\pi} \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx = ik \tilde f(k)$$

which is what we set out to prove.

Transforming formula (141) in a similar way we can prove that
if the function $\tilde f$ is differentiated with respect to k its Fourier
inverse transform f is multiplied by $-ix$.

3. If the function $\tilde f(k)$ is the Fourier transform of the function
f(x), the function $\frac{1}{a} \tilde f\left(\frac{k}{a}\right)$ ($a = \text{const} > 0$) is the Fourier
transform of f(ax). Actually, performing the change of variable $ax = s$
we obtain:

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} f(ax)\, e^{-ikx}\, dx = \frac{1}{a} \cdot \frac{1}{2\pi} \int_{-\infty}^{\infty} f(s)\, e^{-i\frac{k}{a}s}\, ds = \frac{1}{a} \tilde f\left(\frac{k}{a}\right)$$

Therefore, if the graph of the Fourier inverse transform is stretched
a-fold along the x-axis, the graph of the Fourier transform is
contracted a-fold along the k-axis and vice versa. This means that we
cannot simultaneously localize (that is concentrate at a certain 
point of the corresponding axis) both a function which serves as 
a Fourier inverse transform and its spectral density. This is the 
so-called uncertainty principle which has many applications in 
physics. 
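
Properties 2 and 3 lend themselves to a quick numerical check. The following Python sketch is an illustration under our own assumptions: the test function $f(x) = e^{-x^2}$, the truncation of the infinite range to $[-12, 12]$, and the names `transform`, `f`, `f_prime` are not from the text.

```python
import cmath
import math

def transform(g, k, half_width=12.0, steps=6000):
    """Formula (138) by the midpoint rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        x = -half_width + (j + 0.5) * h
        total += g(x) * cmath.exp(-1j * k * x)
    return total * h / (2.0 * math.pi)

def f(x):
    return math.exp(-x * x)               # test function (an assumption)

def f_prime(x):
    return -2.0 * x * math.exp(-x * x)    # its derivative

k, a = 1.3, 2.0                           # arbitrary sample values

# Property 2: the transform of f' equals ik times the transform of f.
lhs2 = transform(f_prime, k)
rhs2 = 1j * k * transform(f, k)

# Property 3: the transform of f(ax) equals (1/a) * f_tilde(k/a).
lhs3 = transform(lambda x: f(a * x), k)
rhs3 = transform(f, k / a) / a

print(abs(lhs2 - rhs2), abs(lhs3 - rhs3))   # both differences should be negligible
```

For this particular $f$ the transform is known in closed form, $\tilde f(k) = e^{-k^2/4} / (2\sqrt{\pi})$, which gives an independent check of `transform` itself.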

4. If the function f(x) is shifted by $\beta = \text{const}$ along the x-axis
(i.e. its graph is shifted by the distance $\beta$), its Fourier transform
is multiplied by $e^{-i\beta k}$. In fact, making the substitution $x - \beta = s$
we obtain

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} f(x - \beta)\, e^{-ikx}\, dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(s)\, e^{-iks}\, e^{-ik\beta}\, ds = e^{-i\beta k} \tilde f(k)$$

Conversely, if the transform is shifted by $\beta$ along the k-axis the
inverse transform is multiplied by $e^{i\beta x}$.

5. Parseval's Theorem. If we apply formula (123) to series (115)
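we obtain

$$\int_{-l}^{l} |f(x)|^2\, dx = 2l \sum_{n=-\infty}^{\infty} |c_n|^2$$

Taking advantage of formula (139) we deduce from the last relation
the equality

$$\int_{-l}^{l} |f(x)|^2\, dx = 2l \sum_{n=-\infty}^{\infty} |\tilde f(k_n)|^2 (\Delta k)^2 = 2\pi \sum_n |\tilde f(k_n)|^2\, \Delta k$$

Passing to the limit as $l \to \infty$ we obtain, by analogy with Sec. 32,
the relation

$$\int_{-\infty}^{\infty} |f(x)|^2\, dx = 2\pi \int_{-\infty}^{\infty} |\tilde f(k)|^2\, dk$$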


It is this relation that is called Parseval’s theorem for the Fourier 
transform. 
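
Parseval's relation, with its factor $2\pi$ coming from the normalization of (138), can be tested numerically. The sketch below is illustrative only and rests on our own assumptions: the test function $f(x) = e^{-x^2}$, the truncation of both infinite ranges to $[-12, 12]$, and the hypothetical helper names.

```python
import cmath
import math

def integrate(g, a, b, steps=800):
    """Midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (j + 0.5) * h) for j in range(steps))

def f(x):
    return math.exp(-x * x)   # test function (an assumption)

def f_tilde(k):
    """Formula (138), with the infinite range truncated to [-12, 12]."""
    return integrate(lambda x: f(x) * cmath.exp(-1j * k * x), -12.0, 12.0) / (2.0 * math.pi)

# Left-hand side: integral of |f(x)|^2 over x.
energy_x = integrate(lambda x: abs(f(x)) ** 2, -12.0, 12.0)
# Right-hand side: 2*pi times the integral of |f_tilde(k)|^2 over k.
energy_k = 2.0 * math.pi * integrate(lambda k: abs(f_tilde(k)) ** 2, -12.0, 12.0)

print(energy_x, energy_k)     # both should be close to sqrt(pi / 2)
```

For this Gaussian both sides can also be found by hand: each equals $\sqrt{\pi/2}$.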

34. Application to Oscillations of Infinite String. The Fourier 
integral transformation is applied to solving some problems of 
mathematical physics for infinite media. We shall illustrate the 
solution of equation (131) in the case of an infinite string, i.e. when 
$-\infty < x < \infty$. Hence, there are no boundary conditions in this
problem and it is only the initial conditions that define the
sought-for solution. For the sake of simplicity, let us put $\psi(x) \equiv 0$ in
the initial conditions (133). We denote by $\tilde u(k, t)$ the Fourier
transform of the solution $u(x, t)$ for any fixed value of $t \ge 0$. Passing
to the Fourier transforms of the left-hand and right-hand sides of
equation (131) and taking advantage of properties 1 and 2 in Sec. 33
we obtain

$$\frac{\partial^2 \tilde u}{\partial t^2} = a^2 (ik)^2 \tilde u = -a^2 k^2 \tilde u.$$


For any fixed $k$, this is an ordinary linear differential equation with
constant coefficients which can be solved by means of the standard
methods given in Sec. XV.17. Thus we obtain

$$\tilde u = C_1(k)\, e^{-iakt} + C_2(k)\, e^{iakt} \tag{146}$$

Now we take the Fourier transforms of the initial conditions
(133):

$$\tilde u \big|_{t=0} = \tilde\varphi(k) \quad \text{and} \quad \frac{\partial \tilde u}{\partial t} \Big|_{t=0} = 0$$

Consequently, formula (146) results in

$$\tilde u = \frac{1}{2} \tilde\varphi(k)\, e^{-iatk} + \frac{1}{2} \tilde\varphi(k)\, e^{iatk}$$

(verify the calculations!). On the basis of properties 1 and 4 in
Sec. 33, we now return to the inverse image:

$$u = \frac{1}{2} \varphi(x - at) + \frac{1}{2} \varphi(x + at)$$

It is the last formula that yields the sought-for solution. The 
meaning of the formula is quite simple: the initial deflection from 
the equilibrium state is divided into two equal parts; one of the 
parts is shifted at the moment t by the distance at in the positive 
direction of the x-axis whereas the other is shifted by the same
distance in the opposite direction. In other words, the two waves
propagate along the string with the speed a in the positive and negative
directions without changing their initial form. At each moment of 
time we watch the result of the superposition of the waves. Thus, 
we see what is the physical meaning of the constant a entering 
into equation (131): it is equal to the speed at which an initial 
perturbation propagates along the string. 
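
The solution just obtained can be verified directly. The following Python sketch is an illustration under stated assumptions: equation (131) is taken to be the wave equation $u_{tt} = a^2 u_{xx}$ (consistent with the transformed equation above), the initial deflection $\varphi(x) = e^{-x^2}$ and the value of $a$ are our choices, and the derivatives are estimated by central differences.

```python
import math

A_SPEED = 2.0   # the constant a of equation (131); the value is chosen for illustration

def phi(x):
    """Initial deflection (an assumption): a smooth bump."""
    return math.exp(-x * x)

def u(x, t, a=A_SPEED):
    """The solution found in the text: half of phi moves right, half moves left."""
    return 0.5 * phi(x - a * t) + 0.5 * phi(x + a * t)

def residual(x, t, a=A_SPEED, h=1e-3):
    """Central-difference estimate of u_tt - a^2 * u_xx, assumed to be the form of (131)."""
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h ** 2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
    return u_tt - a * a * u_xx

print(residual(0.3, 0.7))          # small, limited by the finite-difference error
print(u(0.5, 0.0) - phi(0.5))      # the initial deflection is reproduced
print(u(0.5, 1e-6) - u(0.5, -1e-6))  # u is even in t, so the initial velocity is zero
```

The residual is small but not exactly zero, since the second derivatives are only approximated; shrinking `h` moderately reduces it until rounding error takes over.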



CHAPTER XVIII 


Elements of the Theory 
of Probability 


§ 1. Random Events and Their Probabilities 

1. Random Events. The theory of probability deals with random 
events. The notion of an event is a basic one, and it is rather difficult 
to give its comprehensive definition. 

For the aims of our course it will be sufficient to regard as an event 
everything that may or may not occur when a certain set of con- 
ditions is realized. Every realization of this kind is called a trial. 
For instance, when tossing a coin we can consider the fact that it 
shows heads to be an event. In this case tossing the coin serves as 
a trial. We can regard it as an event when an article randomly se- 
lected from a lot containing a number of manufactured articles 
turns out to be defective. In this example sampling a unit from the 
lot is a trial. But a trial, as it is understood in the theory of pro- 
bability, must not necessarily be connected with human activities. 
For example, if we consider it to be an event that it will rain in 
a certain place on a certain day, the fact that this day comes should 
be regarded as a trial. 

A characteristic feature of a random event is that it may not 
necessarily occur when a trial is realized. This distinguishes a ran- 
dom event from a deterministic one which inevitably occurs. The 
randomness of an event is connected with the fact that many con- 
comitant factors which are essential for the outcome of a trial may 
not be given. The incompleteness of information can sometimes 
be intrinsic (for instance, in games of chance or in warfare) or can 
result from the inaccessibility of some kind of information at the 
present level of the development of science (for example, in problems 
of weather forecast). The assumption that the outcomes of individual 
trials cannot be predicted is taken as a basic principle in quantum 
mechanics, genetics and some other sciences. Besides, there are 
some cases in which the exact prediction of the outcomes of certain 
trials is possible but not advantageous when it requires unnecessary 
expenditure connected with additional precision measurements and 
the like. 


Regularities of random events appear in mass-scale phenomena
when trials are repeated a large number of times. For instance, we 
cannot predict the result of a single toss of a coin because it can 
come up heads or tails. Nobody will find it very strange if we have 
heads twice when we toss a coin ten times. But if we get only 200 
heads after the coin has been tossed 1000 times we have every reason 
to say that there is something wrong with the coin or with tossing. 
Indeed, if the conditions are equal neither heads nor tails have 
any advantage and therefore they must appear an approximately
equal number of times. Of course, when tossing an “honest” coin
1000 times we may not necessarily have heads exactly 500 times; 
we can have them 490 or 525 times or so but not 200 times! Similarly, 
if we examine a single unit selected at random from a lot we cannot 
have a good judgement on the quality of the lot. This can be done 
only after a sufficiently large number of repeated trials have been 
performed or, as we say, when we have a sufficiently large sample
size. Thus, specifying what was said at the beginning of this section,
we can say that the theory of probability deals with random events 
which occur in mass-scale phenomena when the corresponding set 
of conditions is realized a large number of times. 

There are two possible ways of understanding the repetition 
of trials. For instance, we can toss one and the same coin 1000 times 
but we can also independently toss 1000 similar coins at different
instants of time or even simultaneously. Both possibilities are
quite equivalent, and further we shall not distinguish between 
them. 

2. Probability. In everyday life we often say that a certain event 
is highly probable whereas some other event is improbable. Of 
course, in case the corresponding trials can be repeated many times 
these assertions mean that the former event will occur frequently 
and the latter will occur seldom. 

An important feature of the theory of probability is that it not 
only indicates that the probability of an event is high or low but 
also attributes an exact numerical value to it. Hence, the probability 
of an event is considered to be a numerical value which characterizes 
the frequency of the occurrences of the event in a large number of 
repeated trials. 

Suppose that a coin was tossed 1000 times and that it came up 
heads 490 times. Then the ratio $\frac{490}{1000} = 0.49$ is said to be the relative

frequency of the coin coming up heads in the given series of trials. 
Let the coin be tossed 10,000 times and let heads appear 5027 times; 
then the relative frequency is equal to 0.5027. It is clear that if 
the coin is symmetric and if the number of trials increases the rela- 
tive frequency of heads must approach 0.5 because neither of the 
faces of the coin has any advantage over the other. It is the number
0.5 that is called the probability of heads when the coin is being
tossed.

In the general case the definition is stated similarly. Denote
a random event by $A$. Suppose that the event occurred $N_A$ times
in a series of $N$ independent trials. Then the ratio $\frac{N_A}{N}$ is called the
relative frequency of the event $A$ in the given series of trials.
The limit

$$P\{A\} = \lim_{N \to \infty} \frac{N_A}{N}$$

to which the relative frequency of the event A tends when the number 
of trials is increased unlimitedly is called the probability of the 
random event A. 

Hence, if the number of trials is sufficiently large the relative 
frequency of an event can be approximately taken as its probability. 
This fact implies a method of empirical calculation of probabilities 
when it is difficult to find them theoretically. Let us consider an 
example. The probability that a new-born child will be a boy is 
known with a great accuracy from the statistics of human popula- 
tion. The probability equals 0.512 although from time to time there 
appears a deviation from this value. Knowing this probability we 
cannot predict whether a new-born child will be a boy or a girl 
for each concrete case. We can only say that the probability of the 
child being a boy is a little higher, i.e. the birth of a boy is a little 
more probable. But nevertheless we can assert that the number of
boys among a million new-born children will be close to 512,000
(in Sec. 20 we shall work out some methods for estimating the degree 
of this closeness). 
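
The stabilization of the relative frequency can be observed in a simulated experiment. The Python sketch below is only an illustration: it replaces the physical coin by a pseudo-random number generator with a fixed seed, and the function name `relative_frequency` is hypothetical.

```python
import random

def relative_frequency(num_trials, p_heads=0.5, seed=0):
    """Relative frequency N_A / N of heads in num_trials simulated tosses."""
    rng = random.Random(seed)              # fixed seed: the run is reproducible
    heads = sum(1 for _ in range(num_trials) if rng.random() < p_heads)
    return heads / num_trials

for n in (10, 1000, 100000):
    print(n, relative_frequency(n))
```

For small series the printed frequency may deviate noticeably from 0.5, while for the longest series it is expected to lie close to it. Setting `p_heads=0.512` imitates the statistics of births quoted above.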

There are some cases when the probability can be calculated by 
means of figuring the number of favourable trial outcomes (favourable 
cases). We shall illustrate this method by taking a concrete example. 
Suppose that we toss a die on whose faces sums of points from 1 to 6 
are marked. Let it be necessary to find the probability of a throw 
giving a sum of points divisible by 3. Imagine that we have made 
a large number $N$ of throws. Let $N_k$ denote the number of
occurrences of $k$ points ($k = 1, 2, 3, 4, 5$ and $6$). Then we have
$N_1 + N_2 + \ldots + N_6 = N$, that is $\frac{N_1}{N} + \frac{N_2}{N} + \ldots + \frac{N_6}{N} = 1$. But since none
of the faces has any advantage over the others, all the six
fractions are approximately equal to $\frac{1}{6}$ if $N$ is sufficiently large, and
in the limit they become exactly equal to each other when $N \to \infty$.
Hence, in the limit the fractions are equal to $\frac{1}{6}$. But a sum of points
is a multiple of 3 if it equals three or six, and hence the number of
favourable outcomes is equal to $N_3 + N_6$. The relative frequency
of the event is equal to

$$\frac{N_3 + N_6}{N} = \frac{N_3}{N} + \frac{N_6}{N} \xrightarrow[N \to \infty]{} \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$$

This is just the sought-for probability. The result thus obtained
can be formulated briefly as follows: we have six possible outcomes
in throwing a die which correspond to the range of possible sums
of points. There are two outcomes among them which are favourable
to the event in question, namely throwing three and six. The other
cases are unfavourable. Hence, the probability of the event equals
$\frac{2}{6} = \frac{1}{3}$.

The general scheme of such calculations can be described in the 
following way. Suppose that a trial can result in exactly one of n 
possible outcomes, these outcomes being equally probable (in such 
circumstances we also say that we have n equally probable possible 
cases). Let us consider an event $A$ which occurs when $q$ of these
outcomes appear and does not occur when the other $n - q$ outcomes
appear. We call these $q$ outcomes favourable to the event $A$ whereas
the other $n - q$ cases are called unfavourable to $A$. Then, reasoning
as in the preceding paragraph, we conclude that

$$P\{A\} = \frac{q}{n}$$

Thus, the probability of an event is equal to the ratio of the num- 
ber of trial outcomes favourable to the event to the number of all 
possible outcomes. 
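
The scheme of favourable cases amounts to counting, which the following Python sketch carries out for the die example. It is an illustration only; the helper name `probability` is ours, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def probability(outcomes, is_favourable):
    """Ratio q / n of favourable outcomes to all equally probable outcomes."""
    outcomes = list(outcomes)
    q = sum(1 for w in outcomes if is_favourable(w))
    return Fraction(q, len(outcomes))

die = range(1, 7)   # the six equally probable sums of points

print(probability(die, lambda points: points % 3 == 0))   # 1/3
```

The counting is legitimate only because the six outcomes are equally probable; the function itself cannot check that assumption.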

Using the scheme of favourable cases we can easily find the
probability of winning a prize for an owner of a lottery ticket (for
this purpose the total number of prizes should be divided by the
number of tickets). Many other similar probabilities can be found
in a similar way. In performing such calculations we must be sure
that the outcomes of trials are equally possible, i.e. equally
probable. For instance, it would be wrong to reason in the following
way: the sum of points obtained when tossing the die can be either
divisible by three or indivisible and therefore there is one favourable
case among the two for getting a sum multiple of three, and hence
the sought-for probability equals $\frac{1}{2}$ (where does the mistake lie

in this argument?). Of course, we always idealize reality when we 
consider some outcomes to be equally possible, and therefore this 
assumption only approximately holds in concrete problems, with 
a certain accuracy. Such an approach is justified only when there 
is symmetry in trials under consideration. 

There are various modifications of the scheme of figuring the
number of favourable cases. Let us consider an example illustrating
one of the variants of the method. Suppose we have a homogeneous
spherical ball with surface area $S$. Let a portion of the surface having
an area $S_0$ be blackened. Let the ball be thrown at random on a
horizontal plane and let it be necessary to calculate the probability
that the ball strikes the plane with the portion $S_0$ of its surface.
To perform the calculation we imagine that the whole surface is
divided into small parts of equal areas $dS$. Then the point of impact
can belong to any of these parts with an equal probability. But the
total number of these parts is equal to $\frac{S}{dS}$, and the number of the
parts belonging to the portion $S_0$ is equal to $\frac{S_0}{dS}$. Hence, there are
$\frac{S_0}{dS}$ cases favourable to the event in question among the total number
of cases $\frac{S}{dS}$, and thus the sought-for probability is equal to
$\frac{S_0}{dS} : \frac{S}{dS} = \frac{S_0}{S}$. If we pose the same problem for an ellipsoid instead of the

ball the solution will depend not only on the area but also on the 
disposition of the blackened portion on the surface of the ellipsoid. 
The solution involves integration, and we leave it to the reader. 

In conclusion we note that in everyday life the term “probability" 
is sometimes applied to such events which cannot be repeated, even 
mentally. For instance, we sometimes speak about the probability 
of whether there exists life on Mars and the like. In such cases it 
would be better to speak about estimating the likelihood of a hypo- 
thesis. The likelihood theory is not thoroughly developed at present. 

3. Basic Properties of Probabilities.

1. The probability of any event $A$ is a dimensionless quantity
whose numerical value lies between the limits 0 and 1:

$$0 \le P\{A\} \le 1$$

The property immediately follows from the definition of probability
given in Sec. 2. The definition also indicates that the greater
$P\{A\}$, the greater the possibility of the occurrence of the event,
i.e. the greater its probability understood in the everyday sense.

2. The probability of a certain (sure) event, that is of an event 
which unavoidably occurs, is equal to unity. Thus, we regard a
certain, deterministic event as a special case of a random event (this
resembles our arguments in Sec. 1.5 where we considered a constant 
quantity to be a special case of a variable quantity). Further, the 
probability of an impossible event is equal to zero. 

In the case of a finite number of possible outcomes of trials the
converse assertions are also true. Namely, if the probability is equal
to unity (zero) the event is certain (impossible). But in the general
case these assertions are no longer true. For instance, our discussion
in Sec. 2 shows that the probability that a ball thrown at random
will strike a plane with a point which is set beforehand is equal 
to zero. At the same time such an event is not impossible (theore- 
tically) because its occurrence does not contradict the laws of mecha- 
nics. But of course the event is practically impossible. 

3. The sum of probabilities of any event and its opposite event 
is always equal to unity. We say that two events are contrary or 
opposite to each other if the occurrence of one of them is equivalent 
to the non-occurrence of the other. In other words, each of the two 
contrary events is the negation of the other. If the probability of 
hitting a target under certain conditions is equal to 0.2 the proba- 
bility of failing to hit the target under the same conditions is equal 
to 0.8. To prove this assertion in the general case we denote two
contrary events by $A$ and $\bar A$. Let $N$ trials be made and let the event $A$
occur $N_A$ times and the event $\bar A$ occur $N_{\bar A}$ times. Then it is evident
that $N_A + N_{\bar A} = N$ which implies $\frac{N_A}{N} + \frac{N_{\bar A}}{N} = 1$. Passing to the
limit for $N \to \infty$ we find that

$$P\{A\} + P\{\bar A\} = 1 \tag{1}$$

4. We can similarly prove a more general assertion: if a trial 
results in the necessary occurrence of one and only one event
belonging to a group of events $A_1, A_2, \ldots, A_k$ we have

$$P\{A_1\} + P\{A_2\} + \ldots + P\{A_k\} = 1 \tag{2}$$

5. We now consider two events A and B such that each of them 
may or may not occur when one and the same trial is performed. 
Suppose that $N$ such trials have been made. Let $N_{A \text{ and } B}$ designate
the number of trials in which both events occurred and let $N_{A \text{ and } \bar B}$
be the number of trials in which $A$ occurred and $B$ did not occur
and so on. Using this notation we can write

$$N = N_{A \text{ and } B} + N_{A \text{ and } \bar B} + N_{\bar A \text{ and } B} + N_{\bar A \text{ and } \bar B}$$

Besides, for the total number of trials in which the event $A$ occurred
and for the total number of trials in which the event $B$ occurred
we can write

$$N_A = N_{A \text{ and } B} + N_{A \text{ and } \bar B} \quad \text{and} \quad N_B = N_{A \text{ and } B} + N_{\bar A \text{ and } B} \tag{3}$$

Furthermore, let us denote the number of trials in which at least
one of the events $A$ and $B$ occurred by $N_{A \text{ or } B}$. Then we have

$$N_{A \text{ or } B} = N_{A \text{ and } B} + N_{A \text{ and } \bar B} + N_{\bar A \text{ and } B} \tag{4}$$

Formulas (3) and (4) imply

$$\frac{N_{A \text{ or } B}}{N} = \frac{N_A}{N} + \frac{N_B}{N} - \frac{N_{A \text{ and } B}}{N}$$

Passing to the limit, as $N \to \infty$, we arrive at the formula

$$P\{A \text{ or } B\} = P\{A\} + P\{B\} - P\{A \text{ and } B\}$$

in which the sense of the notation is quite clear. 

In particular, if the events A and B are mutually exclusive, that 
is such that they cannot occur simultaneously, we obtain the following
theorem of addition of probabilities (addition rule of probability
theory):

$$P\{A \text{ or } B\} = P\{A\} + P\{B\} \quad \text{(where } A \text{ and } B \text{ are mutually exclusive)}$$

The following more general rule is proved in a similar way: if
the events $A_1, A_2, \ldots, A_k$ are pairwise mutually exclusive we have

$$P\{A_1 \text{ or } A_2 \ldots \text{ or } A_k\} = P\{A_1\} + P\{A_2\} + \ldots + P\{A_k\} \tag{5}$$
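
Both forms of the addition rule can be confirmed by direct counting. The sketch below is an illustration only: it uses the six equally probable outcomes of a die, and the events `A`, `B`, `C`, `D` as well as the helper `prob` are our own choices.

```python
from fractions import Fraction

OUTCOMES = set(range(1, 7))   # equally probable outcomes of one throw of a die

def prob(event):
    """Probability of an event given as a set of outcomes (scheme of favourable cases)."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

A = {2, 4, 6}   # an even number of points
B = {3, 6}      # a number of points divisible by 3

# General rule: P{A or B} = P{A} + P{B} - P{A and B}.
print(prob(A | B), prob(A) + prob(B) - prob(A & B))   # 2/3 2/3

C = {1, 2}
D = {5}         # C and D are mutually exclusive

# Special case for mutually exclusive events: P{C or D} = P{C} + P{D}.
print(prob(C | D), prob(C) + prob(D))                 # 1/2 1/2
```

Here set union plays the role of "or" and set intersection the role of "and".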

4. Theorem of Multiplication of Probabilities. Let $A$ and $B$ be
two events. Then the conditional probability $P\{A \mid B\}$ of the event
$A$ relative to the hypothesis that the event $B$ has occurred is the
probability of the event $A$ calculated on the condition that the event $B$
has taken place. Therefore, when calculating this probability by
means of the corresponding relative frequency (see Sec. 2), we must
take into account only those trials whose outcomes resulted in the
occurrence of the event $B$:

$$P\{A \mid B\} = \lim_{N \to \infty} \frac{N_{A \text{ and } B}}{N_B}$$

For instance, suppose we are given two urns. Let the first urn
contain three black balls and one white ball and the second one
contain one black ball and three white balls. Suppose that we
randomly select one of the urns and draw a ball from it at random.
What is the probability that the ball will be black? The obvious
symmetry of the possible outcomes indicates that $P\{A_{black}\} = \frac{1}{2}$,
where $A_{black}$ is the event consisting in the occurrence of a black
ball. We now suppose that it is known that we have selected the
first urn. Let us denote this event (that is selecting the first urn)
as $B_1$. Then it is apparent that the conditional probability
$P\{A_{black} \mid B_1\} = \frac{3}{4}$.

Taking the simple formula

$$\frac{N_{A \text{ and } B}}{N} = \frac{N_B}{N} \cdot \frac{N_{A \text{ and } B}}{N_B}$$

and passing to the limit, as $N \to \infty$, we obtain the following
multiplication rule of probability theory:

$$P\{A \text{ and } B\} = P\{B\}\, P\{A \mid B\} = P\{A\}\, P\{B \mid A\} \tag{6}$$

(the last expression has been obtained by interchanging the roles
of $A$ and $B$).

Thus, the probability that two events take place simultaneously 
is equal to the product of the probability of one of them by the con- 
ditional probability of the other provided that the first event has 
occurred. 

Formula (6) becomes especially simple when the events A and B 
are independent. We call two events independent if any information 
concerning the occurrence or non-occurrence of one of them does 
not affect the probability of the other. Thus, in this case we have 

$$P\{A \mid B\} = P\{A\}, \quad P\{A \mid \bar B\} = P\{A\}, \quad P\{B \mid A\} = P\{B\}, \quad P\{B \mid \bar A\} = P\{B\}$$

(By the way, on the basis of equalities (1), (5) and (6), we can easily 
conclude that each of the above relations implies the other three.) 
Formula (6), for independent events, turns into 

P {A and B} = P {A} P {B} (7) 

(where A and B are independent) 

Formula (7) can be readily extended to the case of an arbitrary 
number of independent events, that is events such that the infor- 
mation concerning the occurrence or non-occurrence of any group 
of these events does not affect the probabilities of the others. For 

example, if A, B and C are independent events we have 

$$P\{A \text{ and } B \text{ and } C\} = P\{A \text{ and } (B \text{ and } C)\} = P\{A\}\, P\{B \text{ and } C\} = P\{A\}\, P\{B\}\, P\{C\} \tag{8}$$

The above rules enable us to calculate probabilities for some
simple problems. Let us take an example. Suppose there are three
marksmen and each of them fires at a target once. Let the first of them
hit the target with a probability of 0.2, the second with a
probability of 0.3 and the third with 0.5. What is the probability that
the target will be hit at least once? If we denote the event that the
kth marksman hits the target by $A_k$ ($k = 1, 2, 3$) we can say that
we are interested in the probability $P\{A_1 \text{ or } A_2 \text{ or } A_3\}$. We cannot
apply formula (5) here because the events $A_k$ are not mutually
exclusive since the target can be simultaneously hit by two or three
marksmen. Therefore in this case it is easier to calculate the probability
that all the marksmen miss the target because these opposite events are
independent. Thus, by formulas (1) and (8), we obtain

$$P\{A_1 \text{ or } A_2 \text{ or } A_3\} = 1 - P\{\bar A_1 \text{ and } \bar A_2 \text{ and } \bar A_3\} = 1 - P\{\bar A_1\}\, P\{\bar A_2\}\, P\{\bar A_3\} = 1 - 0.8 \times 0.7 \times 0.5 = 0.72$$
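
The same result can be reached the long way, by adding up the probabilities of all mutually exclusive hit and miss patterns, which shows why passing to the opposite event is the easier route. The Python sketch below is an illustration only; the variable names are ours.

```python
from itertools import product

hit_probabilities = (0.2, 0.3, 0.5)   # data of the example in the text

# Shortcut via the opposite event: 1 minus the probability that everyone misses.
p_all_miss = 1.0
for p in hit_probabilities:
    p_all_miss *= 1.0 - p
p_at_least_one = 1.0 - p_all_miss

# Cross-check: sum over the 7 hit/miss patterns that contain at least one hit.
# The patterns are mutually exclusive (formula (5)) and the individual
# results are independent (formula (8)).
p_enumerated = 0.0
for pattern in product((True, False), repeat=3):
    if any(pattern):
        weight = 1.0
        for hit, p in zip(pattern, hit_probabilities):
            weight *= p if hit else 1.0 - p
        p_enumerated += weight

print(round(p_at_least_one, 10), round(p_enumerated, 10))   # 0.72 0.72
```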

Let us consider one more example. Suppose that we randomly
draw two balls in succession from an urn containing three black
balls and one white ball. What is the probability that both balls
will be black? There can be two variants of the problem. Namely,
we can consider sampling with replacement, which means that the
first ball drawn from the urn is replaced in the urn after
its colour has been noted and before the next drawing is made.
Hence, the case when the same ball will be drawn a second time
is not excluded here. Formula (7) is obviously applicable here,
and thus we see that the sought-for probability is equal to
$\frac{3}{4} \cdot \frac{3}{4} = \frac{9}{16}$. But if we consider drawing without replacement, that is if
the first ball drawn from the urn is not replaced in the urn after
its colour has been examined, then this ball does not take part in
the second sampling, and therefore the sought-for probability is
calculated by formula (6) which yields $\frac{3}{4} \cdot \frac{2}{3} = \frac{1}{2}$.
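
Both variants of the problem can be checked by listing all equally probable ordered pairs of draws. The sketch below is an illustration only; the helper `prob_both_black` is a hypothetical name.

```python
from fractions import Fraction
from itertools import permutations, product

urn = ["black", "black", "black", "white"]

def prob_both_black(pairs):
    """Fraction of equally probable ordered pairs (first, second) that are both black."""
    pairs = list(pairs)
    favourable = sum(1 for first, second in pairs if first == second == "black")
    return Fraction(favourable, len(pairs))

# With replacement: every ball may be drawn twice, 4 * 4 = 16 ordered pairs.
with_replacement = prob_both_black(product(urn, repeat=2))

# Without replacement: the second ball differs from the first, 4 * 3 = 12 ordered pairs.
without_replacement = prob_both_black(permutations(urn, 2))

print(with_replacement, without_replacement)   # 9/16 1/2
```

Note that `permutations` distinguishes the balls by position in the list, so the three black balls are counted as three different balls, as the argument requires.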

5. Theorem of Total Probability. We shall begin with an example. 
Let there be three urns. The first urn contains three black balls 
and one white ball, the second contains one black ball and three 
white balls and the third only three black balls. Suppose we random- 
ly selected one of the urns (with equal probability) and then drew 
a ball from the urn at random. What is the probability of the ball
being black? If we drew a black ball this obviously indicates that
we either selected the first urn and drew a black ball from it or
did the same with the second or with the third urn. All these three
variants are pairwise mutually exclusive. By formula (6), the
probability of the first variant taking place is equal to $\frac{1}{3} \cdot \frac{3}{4}$, the
probability of the second variant is equal to $\frac{1}{3} \cdot \frac{1}{4}$ and the probability
of the third one is equal to $\frac{1}{3} \cdot 1$. Hence, the probability of the
occurrence of one of the variants is equal, by formula (5), to

$$\frac{1}{3} \cdot \frac{3}{4} + \frac{1}{3} \cdot \frac{1}{4} + \frac{1}{3} \cdot 1 = \frac{2}{3}$$

This is just the sought-for probability.





Now let us turn to the general case. Suppose that the result of
a trial is the occurrence of one and only one of the k events
$B_1, B_2, \ldots, B_k$ which are pairwise mutually exclusive (in the previous
example the role of such events was played by the selections of one
of the urns). Besides, let us consider an event A (in the above example
the role of A was played by the drawing of a black ball). We can
regard A as being equivalent to the event consisting in the occurrence
of $B_1$ and A, or of $B_2$ and A, or of $B_3$ and A and so on. All the
last variants being mutually exclusive, formula (5) implies

$$P\{A\} = P\{(B_1 \text{ and } A) \text{ or } (B_2 \text{ and } A) \ldots \text{ or } (B_k \text{ and } A)\} = P\{B_1 \text{ and } A\} + P\{B_2 \text{ and } A\} + \ldots + P\{B_k \text{ and } A\}$$

From this, by formula (6), we finally deduce

$$P\{A\} = P\{A \mid B_1\}\,P\{B_1\} + P\{A \mid B_2\}\,P\{B_2\} + \ldots + P\{A \mid B_k\}\,P\{B_k\} \qquad (9)$$

This formula is called the formula of total probability (partition
formula). It can be applied to problems similar to the one
considered in the foregoing paragraph.
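As a check, the formula of total probability can be applied numerically to the example of the three urns. The sketch below is an illustration only; the function name `total_probability` and the list names are assumptions made here.

```python
from fractions import Fraction

# Probabilities P{B_i} of selecting each of the three urns.
p_urn = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
# Conditional probabilities P{A | B_i} of drawing a black ball from urn i.
p_black_given_urn = [Fraction(3, 4), Fraction(1, 4), Fraction(1)]

def total_probability(priors, conditionals):
    """Formula of total probability: sum of P{A | B_i} * P{B_i}."""
    if sum(priors) != 1:
        raise ValueError("the events B_i must exhaust all possibilities")
    return sum(c * p for c, p in zip(conditionals, priors))

p_black = total_probability(p_urn, p_black_given_urn)
print(p_black)
```

Exact fractions are used so that the result can be compared directly with the value found by hand.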

6. Formulas for the Probability of Hypotheses. We begin with 
the above example of the three urns again. Suppose that we know 
the distribution of the balls in the urns and that the urns themselves 
are indistinguishable. This means that when we select one of the 
urns at random we do not know which of the urns has been selected. 
Then, considering the three hypotheses that we have selected the 
first urn or the second urn or the third one we conclude that they 
are all equally probable, that is the probability of each of the
hypotheses is equal to $\frac{1}{3}$. Now let us draw a ball at random from

the urn we have selected and let the ball turn out to be white. Then 
we should reappraise the probabilities of the hypotheses. For in- 
stance, after the drawing of a white ball, it becomes clear that the 
urn we have selected cannot be the third one and that it is more 
probable that we have selected the second urn than the first (why?). 
The probabilities calculated before the performance of the experi- 
ment (i.e. before drawing a ball) are called a priori probabilities, 
and the reappraised probabilities are called a posteriori probabilities 
(the term a priori originates from Latin and means presumptive, 
and a posteriori is the reverse of a priori). Now, how can we find 
these reappraised probabilities? 

Let us take the general case. Let there be several hypotheses
$H_1, H_2, \ldots, H_k$ and let it be known that one and only one of them
holds. Let the a priori probabilities of the hypotheses be equal to
$P\{H_1\}, P\{H_2\}, \ldots, P\{H_k\}$, respectively. Suppose that the
conditional probabilities $P\{A \mid H_i\}$ (i = 1, 2, ..., k) of an event
A relative to each of the hypotheses are known. Then the a posteriori
probabilities of the hypotheses are nothing but the probabilities
$P\{H_i \mid A\}$ (i = 1, 2, ..., k). To calculate them we write, on
the basis of (6), the relations

$$P\{A\}\,P\{H_i \mid A\} = P\{H_i\}\,P\{A \mid H_i\}$$

and then, applying formula (9), deduce

$$P\{H_i \mid A\} = \frac{P\{A \mid H_i\}\,P\{H_i\}}{P\{A \mid H_1\}\,P\{H_1\} + P\{A \mid H_2\}\,P\{H_2\} + \ldots + P\{A \mid H_k\}\,P\{H_k\}}, \qquad i = 1, \ldots, k$$

These are the sought-for formulas for the probability of hypotheses
(Bayes' theorem). Let the reader apply the formulas to verify that
the reappraised probabilities in the problem considered in the
preceding paragraph are equal to $\frac{1}{4}$, $\frac{3}{4}$ and 0, respectively.
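The reappraisal of the hypotheses in the example of the three urns can be sketched as follows. This is an illustration under the assumptions of the example (equal a priori probabilities, a white ball drawn); the function name `posterior` is introduced here and does not come from the text.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """A posteriori probabilities P{H_i | A} by Bayes' theorem."""
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P{A | H_i} P{H_i}
    p_event = sum(joint)  # P{A} by the formula of total probability
    return [j / p_event for j in joint]

# A priori probabilities of having selected the first, second or third urn.
priors = [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)]
# Probabilities of drawing a white ball from each urn.
p_white_given_urn = [Fraction(1, 4), Fraction(3, 4), Fraction(0)]

post = posterior(priors, p_white_given_urn)
print(post)
```

The third hypothesis receives zero probability because the third urn contains no white balls, and the second urn becomes three times as probable as the first.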

7. Disregarding Low-Probability Events. We see that the methods 
of the theory of probability enable us to calculate the probabilities 
of various events. How can we utilize these results? One can hardly 
be satisfied with the answer that a given event will either occur 
or not. 

There is an approach to the problem which is typical of applied
mathematics. It is based on the idea that if the probability of an
event A under consideration is sufficiently small, that is if
$P\{A\} < \varepsilon$ where ε is a sufficiently small positive number, we can
approximately put $P\{A\} = 0$ and thus consider the event A to be
practically impossible. In such a case we simply disregard the possibility
that A may occur. Of course, this does not exclude the theoretical
possibility of the occurrence of A, and therefore the prediction that
A will not occur may turn out to be wrong. But the smaller ε, the
rarer the occurrences of the event.

But how can we choose ε? There are various traditions concerning
this question in different divisions of applied mathematics. If
there is nothing dangerous in the occurrence of the event A, that is
if the error introduced by the incorrect prediction can be easily
corrected, we can put ε = 0.1. This means that in the long run
approximately 10 per cent of predictions will be false. But if a
higher reliability is not connected with essential difficulties we
usually put ε = 0.01. For instance, if we toss a coin 100 times the
meaning of the choice of ε = 0.01 is that we disregard the
possibility of such events as the coin coming up heads seven times in
succession because $2^7 > 100$, that is $(1/2)^7 = 1/128 < 0.01$.
For still more accurate predictions we can put ε = 0.001; then the
average frequency of incorrect predictions will be about one per
thousand and so on. The smaller ε, the more accurate the prediction.
But at the same time it is more difficult to guarantee such an accuracy
when ε is decreased. The accuracy should be particularly great if an
incorrect prediction may be connected with casualties. In such cases
we sometimes cannot rely upon probabilistic inferences, and then we
have to resort to deterministic ones.

In Sec. 14 we shall discuss some methods of choosing criteria
(i.e. choosing ε) according to which events can be considered to be
practically impossible.

§ 2. Random Variables 

8. Definitions. A random variable is a variable quantity which 
randomly assumes a certain numerical value resulting from the out- 
come of a trial. This value depends on chance and, generally speak- 
ing, varies as the trials are repeated. 

Examples of random variables are the number of students atten- 
ding a lecture, the length of a manufactured article taken from 
a lot, the duration of life of a person and so on. 

Like every quantity (see Sec. 1.5), a random variable can be 
discrete or continuous. For instance, the first random variable in 
the above examples is discrete whereas the other two are continuous. 
It is essential here that even before a trial has been made we know 
that the possible values of the number of students are integral, where- 
as it is impossible to set beforehand the possible discrete values 
of the lengths of the articles. 

To obtain a representation of a discrete random variable we can
enumerate all its possible values and indicate the probabilities
with which these values are assumed. This results in a table of the
following form:

    values of ξ      $x_1$   $x_2$   $x_3$   ...
    probabilities    $p_1$   $p_2$   $p_3$   ...          (10)

Such a table can be finite or infinite (theoretically). It is apparent
that all the probabilities $p_i$ must be non-negative and their sum,
according to formula (2), must be equal to unity. In the special
case when there is only one possible value it must be assumed with
probability 1, that is the variable in question necessarily assumes
this value. Thus, in such a case we have a deterministic quantity.

A continuous random variable ξ can assume all the numerical
values or all the values belonging to some interval (or to a system
of intervals). But the probability that such a random variable will
exactly take on any value x set beforehand is equal to zero. (This
situation is similar to that of a continuously distributed mass when
the mass of any separate point is considered to be equal to zero.)
But we can speak about the probability that the random variable ξ
will assume a value belonging to a given interval of the x-axis.
The probability of the value of ξ falling in an infinitesimal interval
from x to x + dx is also infinitesimal; it is directly proportional
to dx and depends on x. Hence, this probability is equal to an
expression of the form p(x) dx where p(x) is the so-called probability
density function (the density of probability distribution or the
frequency function of the probability distribution). This function
completely characterizes the random variable ξ. Apparently, we
always have $p(x) \ge 0$ ($-\infty < x < \infty$). Formula (5) shows that
the probability that the value assumed by the random variable ξ
belongs to an interval $a \le x \le b$ is equal to $\int_a^b p(x)\,dx$. Hence,
by formula (2), there must be

$$\int_{-\infty}^{\infty} p(x)\,dx = 1 \qquad (11)$$

The expression p(x) dx is called the element of the probability
distribution. Henceforward we shall write integrals taken from
$-\infty$ to $\infty$ without indicating the limits of integration; for instance,
formula (11) will be put down as $\int p(x)\,dx = 1$. This will not
lead to any misunderstandings because we shall not deal with
indefinite integrals in this chapter. (By the way, the sign $\int$ is rarely
used in mathematical applications for denoting indefinite integrals.
It usually designates definite integrals for which the limits of
integration are implied by the corresponding physical or mathematical
meaning of the integrals. For instance, we sometimes mean that
the sign $\int$ designates definite integrals taken over maximal ranges
of variation of the corresponding variables of integration.)

Using the notion of the delta function (see Sec. XIV.25) we can also
introduce the probability density function for a discrete random
variable. For instance, if such a random variable is represented by
a table of form (10) we have

$$p(x) = p_1\,\delta(x - x_1) + p_2\,\delta(x - x_2) + p_3\,\delta(x - x_3) + \ldots$$

Delta functions also enable us to consider the density of probability
distribution for a random variable of a mixed (continuous-discrete)
type. In the general case the representation of a random variable
is equivalent to the construction of a non-negative measure on the
straight line (see Sec. XVI.19) such that the total measure of the
whole straight line is equal to unity. Then, such a probability
measure being given, the probability that the value of the random
variable will fall in an interval is equal to the measure of the
interval.

9. Examples of Discrete Random Variables. We now consider
a random event A with $P\{A\} = P$. Suppose that we have made
one trial. Then how many times can the event A occur? Evidently,
this number is equal either to 1 or to 0. Hence, we have obtained
a random variable which can assume only two values, namely the
value 1 with probability P and the value 0 with probability 1 - P
[see formula (1)].

Now let the trials be performed several times. For definiteness,
let there be three trials. Suppose that the event A has occurred ν
times in these trials. Then ν can be regarded as a random variable
whose possible values are 0, 1, 2 and 3. Let us compute the
probabilities of these values. If ν = 0 the event A does not occur in
any of the three trials. The trials being independent, the probability
that ν = 0 can be found by formula (8). This results in the probability
equal to $(1 - P)^3$. The value ν = 1 can be obtained in the following
three variants: the event A occurs in the first (second or third)
trial and does not occur in the other two trials. The probability
of each variant is again found by formula (8) which yields the result
$P(1 - P)^2$. Therefore, according to formula (5), the probability
that we shall have one of the variants is equal to $3P(1 - P)^2$.
The cases ν = 2 and ν = 3 are treated similarly, and thus we arrive
at the following table:

    values of ν      0             1               2               3
    probabilities    $(1 - P)^3$   $3P(1 - P)^2$   $3P^2(1 - P)$   $P^3$

(Let the reader check that the sum of the probabilities thus
obtained is equal to unity!)

The general case of n trials is investigated in like manner. Let
again ν be the number of the occurrences of the event A. Then ν
is a random variable for which we obtain the table

    values of ν      0             1                            2                              ...   n
    probabilities    $(1 - P)^n$   $\binom{n}{1} P (1 - P)^{n-1}$   $\binom{n}{2} P^2 (1 - P)^{n-2}$   ...   $P^n$

Here $\binom{n}{k} = \frac{n(n-1)\ldots(n-k+1)}{k!}$ is a binomial coefficient which is
equal to the number of possible cases in which the event A occurs
exactly k times, i.e. it is equal to the number of combinations
of k elements from n. The set of probabilities collected in the above
table is called the law of binomial probability distribution (or,
briefly, the binomial distribution).

Example. A coin is tossed six times. What is the probability
that it will come up heads exactly three times?

Answer: $\binom{6}{3} \left(\frac{1}{2}\right)^3 \left(\frac{1}{2}\right)^3 = \frac{20}{64} = \frac{5}{16}$.

Now let us investigate the behaviour of the binomial distribution
when n, the number of trials, is very large whereas the probability
of the event A is very small so that there is a relation Pn = a
where a is a constant. For this purpose we pass to the limit in the
formula

$$P\{\nu = k\} = \binom{n}{k} P^k (1 - P)^{n-k} = \frac{n(n-1)\ldots(n-k+1)}{k!} \left(\frac{a}{n}\right)^k \left(1 - \frac{a}{n}\right)^{n-k}$$

(where $P\{\nu = k\}$ designates the probability that ν = k) as $n \to \infty$.
This results in the limiting formula

$$P\{\nu = k\} = \frac{a^k}{k!}\,e^{-a} \qquad (k = 0, 1, 2, \ldots)$$

The calculations connected with the deduction of the formula are
left to the reader. Thus, we arrive at a random variable which can
assume infinitely many different values. The probability distribution
thus obtained is illustrated by the following table:

    values of ν      0          1                      2                        ...   k                        ...
    probabilities    $e^{-a}$   $\frac{a}{1!}\,e^{-a}$   $\frac{a^2}{2!}\,e^{-a}$   ...   $\frac{a^k}{k!}\,e^{-a}$   ...

This is the so-called Poisson law (Poisson distribution) named after
S. Poisson (1781-1840), a French mechanician, physicist and
mathematician.

An example of a random variable distributed according to the
Poisson law is the number of atoms of a certain mass of some slowly
disintegrating radioactive substance which decay during a
sufficiently long time interval so chosen that it should be possible to
observe the disintegrations of separate atoms. Then the Poisson law
is in fact applicable because under these conditions the decay of
an atom is independent of the disintegrations of other atoms and
all the atoms disintegrate with equal probability. There is a number
of other similar examples.

10. Examples of Continuous Random Variables. One of the
simplest examples is a random variable which is uniformly distributed
over an interval $a \le x \le b$, that is which can assume all the values
belonging to the interval with equal probability and does
not assume the values lying outside the interval. The probability
density of such a variable is put down in the form

$$p(x) = \begin{cases} c & \text{for } a \le x \le b \\ 0 & \text{for } x < a \text{ and for } x > b \end{cases}$$

Condition (11) implies that $c = \frac{1}{b - a}$. The graph of this function
is shown in Fig. 343. A uniformly distributed random variable is
sometimes said to have a rectangular distribution (Fig. 343
illustrates the origin of the name). For instance, the round-off error which
results from rounding the numerical value of a quantity to its nearest
integer is a uniformly distributed random variable, and we have
a = -0.5, b = 0.5 and c = 1 in this case (why is it so?).

Fig. 343. Graph of the density p(x) of the uniform distribution: height 1/(b - a) between a and b

The most widely spread random variables are distributed
according to the so-called normal law (Gaussian law). The density
function of such a random variable is expressed by the formula

$$p(x) = M e^{-\beta (x - a)^2} = M \exp\left[-\beta (x - a)^2\right]$$

where a, M > 0 and β > 0 are some numerical parameters. The
parameter M can be easily expressed in terms of β. To achieve this
we must take formula (11) and substitute $s = \sqrt{\beta}\,(x - a)$ in it.
Then, using integral (XIV.72), we deduce $M = \sqrt{\beta/\pi}$ (let the
reader verify the calculations!). For our further aims (see Sec. 15) it
will be convenient to introduce the notation $\beta = \frac{1}{2\sigma^2}$ and to put
down the expression of p(x) in the form:

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - a)^2}{2\sigma^2}\right] \qquad (12)$$

The graph of density function (12) is shown in Fig. 344. In § 4 we
shall discuss why the Gaussian law is so widely applied.

Fig. 344. Graph of density function (12)

The probability that a random variable ξ distributed in accord
with law (12) falls in an interval $\alpha < \xi < \beta$ is equal to

$$\frac{1}{\sqrt{2\pi}\,\sigma} \int_{\alpha}^{\beta} \exp\left[-\frac{(x - a)^2}{2\sigma^2}\right] dx \qquad (13)$$

The above probability can be easily expressed by means of the
probability integral

$$\Phi(t) = \sqrt{\frac{2}{\pi}} \int_0^t \exp\left(-\frac{s^2}{2}\right) ds$$

for which there are extensive tables (for instance, see [23], [41] and
[48]). Indeed, substituting $s = \frac{x - a}{\sigma}$ into (13) we obtain the
expression

$$\frac{1}{\sqrt{2\pi}} \int_{(\alpha - a)/\sigma}^{(\beta - a)/\sigma} \exp\left(-\frac{s^2}{2}\right) ds = \frac{1}{\sqrt{2\pi}} \left[\int_0^{(\beta - a)/\sigma} \exp\left(-\frac{s^2}{2}\right) ds - \int_0^{(\alpha - a)/\sigma} \exp\left(-\frac{s^2}{2}\right) ds\right] = \frac{1}{2} \left[\Phi\left(\frac{\beta - a}{\sigma}\right) - \Phi\left(\frac{\alpha - a}{\sigma}\right)\right] \qquad (14)$$
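Instead of tables, the probability integral can nowadays be evaluated through the error function, since $\sqrt{2/\pi} \int_0^t \exp(-s^2/2)\,ds = \operatorname{erf}(t/\sqrt{2})$. The following sketch is an illustration only; the names `Phi`, `normal_interval_probability`, `lower`, `upper`, `mean` and `sigma` are assumptions introduced here for the limits of the interval and the parameters of the normal law.

```python
from math import erf, sqrt

def Phi(t):
    """Probability integral sqrt(2/pi) * integral from 0 to t of exp(-s^2/2) ds."""
    return erf(t / sqrt(2.0))

def normal_interval_probability(lower, upper, mean, sigma):
    """Probability that a normally distributed variable falls between lower and upper."""
    return 0.5 * (Phi((upper - mean) / sigma) - Phi((lower - mean) / sigma))

# Probability of a deviation from the mean smaller than sigma.
p_one_sigma = normal_interval_probability(-1.0, 1.0, 0.0, 1.0)
# Probability of a deviation smaller than three sigma.
p_three_sigma = normal_interval_probability(4.0, 16.0, 10.0, 2.0)
print(p_one_sigma, p_three_sigma)
```

Because only the ratios of the deviations to σ enter the formula, the second call depends on the interval solely through the values ±3.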

11. Joint Distribution of Several Random Variables. We shall
confine ourselves to the case of continuous random variables.
Moreover, for simplicity's sake, we shall consider a system of two
variables. Discrete variables and systems of more than two variables are
investigated in a similar way.

Let us simultaneously consider two random variables ξ and η
which take on certain numerical values in one and the same trial.
Then the probability of ξ falling in an interval between x and x + dx
and η falling between y and y + dy should be proportional both
to dx and to dy. Hence, this probability is equal to an expression
of the form p(x, y) dx dy. The function p(x, y) is referred to as the
probability density (frequency function) of the joint (simultaneous)
distribution of the random variables ξ and η. This function
completely characterizes the pair of random variables ξ, η. Obviously,
the function must satisfy the conditions

$$p(x, y) \ge 0 \quad \text{and} \quad \int dx \int p(x, y)\,dy = 1$$

If the probability density of the joint distribution of two random
variables ξ and η is known we can easily find the probability
density of each of the variables ξ and η (the densities of the so-called
marginal distributions of ξ and η). Actually, formula (5) implies
that the probability that ξ will assume a value lying between x
and x + dx when η can have an arbitrary value is equal to

$$P\{x < \xi < x + dx\} = \int_{y = -\infty}^{\infty} p(x, y)\,dx\,dy = \left(\int p(x, y)\,dy\right) dx$$

It follows, by the definition of the density given in Sec. 8, that the
density of the probability distribution of the variable ξ is a function
$p_\xi(x)$ of the form

$$p_\xi(x) = \int p(x, y)\,dy$$

We similarly deduce the expression

$$p_\eta(y) = \int p(x, y)\,dx$$

for the probability density of η. But the converse transition from
$p_\xi(x)$ and $p_\eta(y)$ to p(x, y) may be impossible in the general case,
that is it may be impossible to restore p(x, y) knowing only $p_\xi(x)$
and $p_\eta(y)$, because here an essential role is played by the
"interaction" between the variables ξ and η.

There is an important special case when p(x, y) can be obtained
on the basis of $p_\xi(x)$ and $p_\eta(y)$. This is the case when the random
variables ξ and η are independent, i.e. when any information
concerning one of them does not affect the probability of a numerical
value assumed by the other. In this case formula (7) implies that

$$p(x, y)\,dx\,dy = P\{x < \xi < x + dx,\ y < \eta < y + dy\} = P\{x < \xi < x + dx\}\,P\{y < \eta < y + dy\} = p_\xi(x)\,dx \cdot p_\eta(y)\,dy$$

i.e.

$$p(x, y) = p_\xi(x)\,p_\eta(y) \qquad (15)$$

Conversely, if condition (15) holds we can show that the random
variables ξ and η are independent.

There is a more general notion of a multidimensional random
variable related to systems of random variables. Such a variable
takes on values which are the elements of a multidimensional
space (R) (see Sec. X.2). The law of probability distribution of
a random variable of this type is represented by a non-negative
measure defined in (R) (see Sec. XVI.19), the measure of the whole
space (R) being equal to unity. The measure of a region belonging
to the space is nothing but the probability that the variable in
question falls in the region. If this measure in (R) is such that it
is possible to perform differentiation with respect to it (for instance,
such as the Lebesgue measure in a finite-dimensional Euclidean space
mentioned in Sec. XVI.19) then, differentiating, we can pass to the
probability density (see Sec. XVI.7). If we introduce generalized
coordinates $t_1, t_2, \ldots, t_k$ in (R), we can consider the set of the
coordinates of the element of the space (R) which represents the
multidimensional random quantity in question instead of the quantity
itself. Thus we come to a system of several random variables having
a probability density of their joint distribution.

As a simple example illustrating what has just been said, we
consider the n-dimensional normal (Gaussian) law. This is the law
of probability distribution of a random vector ξ in the space $E_n$
(see Sec. VII.18) whose probability density is of the form

$$p(x) = M \exp(-x^* A x) \qquad (16)$$

where A is a positive-definite symmetric matrix (see Secs. XI.11
and XII.7) and M is a normalization factor so chosen that the
integral of p(x) taken over the whole space should be equal to unity.
As is known from Sec. XI.11, the quadratic form $x^* A x$ can be
reduced to a diagonal form by means of introducing a new Cartesian
basis in $E_n$. Hence, after the basis has been introduced, frequency
function (16) is transformed to the form

$$p(x') = M \exp\left(-\lambda_1 x_1'^2 - \lambda_2 x_2'^2 - \ldots - \lambda_n x_n'^2\right) = M \exp\left(-\lambda_1 x_1'^2\right) \exp\left(-\lambda_2 x_2'^2\right) \ldots \exp\left(-\lambda_n x_n'^2\right)$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of the matrix A. This
enables us to easily find $M = \sqrt{\lambda_1 \lambda_2 \ldots \lambda_n}\;\pi^{-n/2}$. Besides, formula (15)
indicates that the coordinates of the random vector with respect
to the new basis are independent random variables.

12. Functions of Random Variables. If η = f(ξ) where ξ is a
random variable, η is also a random variable. Besides, if ξ is
discrete (continuous), η is also discrete (continuous). If ξ is represented
by a table of form (10) then, generally speaking, η will assume the
value $f(x_1)$ with the probability $p_1$, the value $f(x_2)$ with the
probability $p_2$ and so on. But at the same time we must take into account
the fact that if $f(x_i) = f(x_j)$ (where $i \ne j$) then, of course, the
corresponding probabilities $p_i$ and $p_j$ are added together. For instance,
if ξ takes on the values -2, -1, 0, 1 and 2 with the same
probability $\frac{1}{5}$, it follows that $\xi^2$ takes on the values 0, 1 and 4 with the
corresponding probabilities $\frac{1}{5}$, $\frac{2}{5}$ and $\frac{2}{5}$ (why?).

If η = f(ξ) and ξ is a continuous random variable with the
probability density $p_\xi(x)$ then in the case when η = f(ξ) is an increasing
function of ξ its probability density $p_\eta(y)$ is expressed by the
formula

$$p_\eta(y) = \frac{1}{dy}\,P\{y < \eta < y + dy\} = \frac{1}{dy}\,P\{x < \xi < x + dx\} = p_\xi(x)\,\frac{dx}{dy} = \frac{p_\xi(x)}{f'(x)}$$

where x entering into the right-hand side is found from the equation
f(x) = y. If the function f(ξ) is a decreasing one, $|f'(x)|$ should
be substituted for $f'(x)$ into the right-hand side. Finally, if the
function f(ξ) is a non-monotone one, the right-hand side should
be replaced by the sum of analogous expressions, the summation
being extended over all the solutions of the equation f(x) = y.
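For a discrete random variable the rule of adding together the probabilities of coinciding values is easy to carry out mechanically. The sketch below is an illustration only; the function name `distribution_of_function` and the dictionary representation of the table are assumptions made here. It treats the variable taking the values -2, -1, 0, 1 and 2 with equal probabilities and finds the table of its square.

```python
from collections import defaultdict
from fractions import Fraction

def distribution_of_function(table, f):
    """Table of f(xi): probabilities of coinciding values f(x_i) are added."""
    result = defaultdict(Fraction)  # missing entries start from probability 0
    for x, p in table.items():
        result[f(x)] += p
    return dict(result)

# The variable takes the values -2, -1, 0, 1, 2 with probability 1/5 each.
xi = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}
xi_squared = distribution_of_function(xi, lambda x: x * x)
print(sorted(xi_squared.items()))
```

The values 1 and 4 each arise from two values of the original variable, which is why their probabilities are doubled.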

We can similarly investigate functions of several random
variables. For example, let us take a function of the form ζ = f(ξ, η)
where the pair of random variables ξ, η is characterized by the
density of their joint probability distribution p(x, y). Then

$$p_\zeta(z) = \frac{1}{dz}\,P\{z < \zeta < z + dz\} = \frac{d}{dz} \iint_{f(x, y) \le z} p(x, y)\,dx\,dy \qquad (17)$$

In the general case the derivative entering into the right-hand side
of (17) is computed according to the rules given in Sec. XVI.18.

We shall illustrate the above result by applying it to calculating
the probability density $p_\zeta(z)$ of the sum ζ = ξ + η of two
independent random variables ξ and η. By formulas (15) and (17), we
obtain

$$p_\zeta(z) = \frac{d}{dz} \iint_{x + y \le z} p_\xi(x)\,p_\eta(y)\,dx\,dy = \frac{d}{dz} \int_{-\infty}^{\infty} \left[\int_{-\infty}^{z - x} p_\eta(y)\,dy\right] p_\xi(x)\,dx = \int_{-\infty}^{\infty} \left[\frac{d}{dz} \int_{-\infty}^{z - x} p_\eta(y)\,dy\right] p_\xi(x)\,dx = \int p_\xi(x)\,p_\eta(z - x)\,dx$$

(let the reader verify the calculations!).

As an exercise, we suggest that the reader should prove that the
sum of two independent random variables each of which is
uniformly distributed over the same interval $0 \le x \le 1$ has the
probability density of the form

$$p(x) = \begin{cases} 0 & \text{for } x < 0 \text{ and } x > 2 \\ x & \text{for } 0 \le x \le 1 \\ 2 - x & \text{for } 1 < x \le 2 \end{cases}$$

§ 3. Numerical Characteristics of Random Variables 

13. The Mean Value. Let there be a discrete random variable ξ
represented by a table of form (10). Suppose that a great number N
of trials have been performed. What will be the arithmetic mean
of the values of ξ thus obtained? To answer the question we denote
by $N_i$ the number of the outcomes of the trials in which ξ has
assumed the value $x_i$. Then the sought-for arithmetic mean is equal to

$$\frac{N_1 x_1 + N_2 x_2 + N_3 x_3 + \ldots}{N} = x_1\,\frac{N_1}{N} + x_2\,\frac{N_2}{N} + x_3\,\frac{N_3}{N} + \ldots$$

But as we know from Sec. 2, we have $\frac{N_i}{N} \to p_i$ when $N \to \infty$. Hence,
in the limit, we obtain the expression

$$x_1 p_1 + x_2 p_2 + x_3 p_3 + \ldots \qquad (18)$$

It is referred to as the mean value (mathematical expectation or,
briefly, expectation or centre of distribution) of the random variable
ξ. This is one of the most important characteristics of ξ. The mean
of ξ is usually designated as $\bar\xi$ or M{ξ} or Mξ. It should be noted
that the mean value of a random variable is no longer a random
variable but is a deterministic quantity. (For instance, verify that
the mean value of the number of points obtained in throwing a die
is the constant number 3.5.)

Formula (18) can be obviously generalized for the case when ξ
is a continuous random variable with the probability density p(x):

$$\bar\xi = \sum x\,p(x)\,dx = \int x\,p(x)\,dx \qquad (19)$$

[We have put down the summation sign to stress the analogy
between formulas (18) and (19); of course there must be the sign of
integration here which has been written in the last expression
entering into (19).] Let the reader verify, by means of the formula,
that the means of the variables considered in Sec. 10 are,
respectively, $\frac{a + b}{2}$ and a. But these results are obviously implied by the
symmetry of the distributions.
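The connection between the arithmetic mean of many trials and the mean value can be illustrated by throwing a die. The sketch below is only an illustration; the names `die`, `mean` and `empirical`, the number of trials and the fixed seed of the pseudo-random generator are assumptions made here.

```python
import random
from fractions import Fraction

# Table of the number of points on a die: values 1..6, probability 1/6 each.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def mean(table):
    """Mean value x_1 p_1 + x_2 p_2 + ... of a discrete random variable."""
    return sum(x * p for x, p in table.items())

exact = mean(die)

# Arithmetic mean of the results of N simulated throws.
random.seed(0)  # fixed seed so that the run is repeatable
N = 100_000
empirical = sum(random.randint(1, 6) for _ in range(N)) / N

print(exact, empirical)
```

The exact value is a deterministic quantity, whereas the arithmetic mean of the simulated throws fluctuates about it and approaches it as the number of trials grows.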
14. Properties of the Mean Value.

1. The definition implies that the mean value $\bar\xi$ of a random
variable ξ has the same dimension as ξ and that it lies between the
greatest and the least possible values of ξ.

2. If we multiply a random variable by a constant (i.e. by a
constant deterministic quantity) its mean value will be multiplied by
the same constant: M{Cξ} = C M{ξ} (C = const). This follows
from Sec. 13 because the multiplication of all the values by a
constant yields the multiplication of the arithmetic mean value by the
same constant. The next property is proved in a similar way.

3. The mean of the sum of two random variables equals the sum
of their means: M{ξ + η} = M{ξ} + M{η}. In particular, if
a constant is added to a random variable, the same constant is added
to its mean value.

By the way, applying the last property we can readily find the
mean value of a random variable ξ distributed according to the
binomial law (see Sec. 9). Let us consider independent random
variables $\xi_1, \xi_2, \ldots, \xi_n$ which take on the value 1 with the
probability P and the value 0 with the probability 1 - P. Apparently,
we can interpret $\xi_i$ as a variable indicating the number of the
occurrences of an event A in the ith trial, the probability of A being P.
Then the variable in question can be represented in the form

$$\xi = \xi_1 + \xi_2 + \ldots + \xi_n \qquad (20)$$

(see Sec. 9 where an analogous variable was denoted by ν). From
(20) we obtain $M\{\xi\} = M\{\xi_1\} + M\{\xi_2\} + \ldots + M\{\xi_n\} = nP$.
This result is directly implied by the meaning of the variable, and
we could have guessed it without applying the above calculations.
It follows that for a Poisson distribution (Sec. 9), which is obtained
in the limit with Pn = a, we have M{ξ} = a.

4. The mean of the product of two independent random variables
is equal to the product of their mean values:

$$M\{\xi\eta\} = M\{\xi\}\,M\{\eta\} \quad \text{if } \xi \text{ and } \eta \text{ are independent}$$

Indeed, if ξ takes the values $x_i$ with the probabilities $p_i$ and η takes
the values $y_j$ with the probabilities $q_j$ then ξη assumes the values
$x_i y_j$ with the probabilities $p_i q_j$ since ξ and η are independent (see
Sec. 4). Therefore

$$M\{\xi\eta\} = \sum_{i, j} x_i y_j\,(p_i q_j) = \sum_i \Big(\sum_j x_i y_j\,p_i q_j\Big) = \sum_i x_i p_i \sum_j y_j q_j = M\{\xi\}\,M\{\eta\}$$

Properties 3 and 4 are immediately extended to an arbitrary
number of summands and factors. It should be noted that the condition
that the factors should be independent is essential for property 4.
If the condition does not hold the property no longer remains true
in the general case. For example, if we square a random variable,
i.e. multiply it by itself, then, as a rule, the mean value of the square
does not equal the square of the mean: $\overline{\xi^2} \ne (\bar\xi)^2$. For instance, in the
example considered at the beginning of Sec. 12 we have M{ξ} = 0
but M{ξ²} = 2 (check it up!).

5. If a random variable ξ assumes the values which are placed
symmetrically with respect to a constant a with equal probabilities,
then $\bar\xi = a$ (this is obvious).

6. If a random variable ξ is represented by a table of form (10)
it follows that $\overline{f(\xi)} = \sum_i f(x_i)\,p_i$. If a continuous random variable
ξ has the probability density p(x) we have $\overline{f(\xi)} = \int f(x)\,p(x)\,dx$
(this immediately follows from the definitions).

In particular, the calculation of the mean of a random variable
enables us to set a criterion according to which a random event
can be considered to be practically impossible (that is to set
a certain value of the quantity ε mentioned in Sec. 7). Here
we shall give only some simple considerations concerning this
question. Suppose we have agreed that the random events whose
probabilities are less than a certain value ε are disregarded, i.e.
they are considered to be practically impossible. But there can
be an incorrect prediction, and this means that an event which
is regarded as impossible may nevertheless occur. Let the loss
connected with the incorrect prediction be equal to an amount
k expressed in certain monetary units. Then the average (mean)
loss will be equal to εk. It is therefore desirable to decrease ε,
but at the same time the perfection of the predictions also involves
some additional expenditure. Let us designate the cost of a trial
which can guarantee a prediction "accurate to ε" as f(ε). In many
concrete problems such a function can be approximately found. An
example of the graph of a function of this type is represented in
Fig. 345. Hence, the average loss connected with incorrect
predictions equals f(ε) + kε, and thus the value $\varepsilon = \varepsilon_0$ which is to be
set as a criterion must be chosen so that this sum should be
minimized.
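The choice of the criterion can be illustrated by a small computation. The text does not specify the cost function, so the sketch below assumes, purely for illustration, a hypothetical cost of the form c/ε together with hypothetical values of c and of the loss k; for this particular form the sum c/ε + kε is least at the value $\sqrt{c/k}$, which the search over a grid of candidate values should reproduce.

```python
from math import sqrt

k = 10_000.0  # hypothetical loss caused by one incorrect prediction
c = 1.0       # hypothetical scale of the cost of a more accurate trial

def cost_of_trial(eps):
    """Assumed cost of a trial guaranteeing a prediction accurate to eps."""
    return c / eps

def average_expenditure(eps):
    """Cost of the trial plus the average loss eps * k."""
    return cost_of_trial(eps) + k * eps

# Candidate criteria from 0.0001 to 0.1 in steps of 0.0001.
candidates = [i / 10_000 for i in range(1, 1001)]
eps0 = min(candidates, key=average_expenditure)

print(eps0, sqrt(c / k))
```

With another cost function the minimizing value would of course be different; only the principle of minimizing the sum is taken from the text.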
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In conclusion we shall give several remarks concerning multidi- 
mensional random quantities (see Sec. 11) which assume their values 
in a finite-dimensional linear space (R) (see Sec. VII. 17). Thus, we 
shall speak about finite-dimensional random vectors. The formula 
of the mean value of such a random variable is similar to (19): 

f= j xdP = j xp(x)dR 
(R) <R) 

where the integration is extended over the whole space (R) and dP 
is the differential of volume (the element of volume) in (R). All 
the properties of the mean in this case are analogous to those of the 
scalar (one-dimensional) case. Besides, property 4 holds for all 
kinds of products in which it is permissible to remove brackets ac- 
cording to the ordinary arithmetical rules (e.g. for the product of 
a vector by a scalar, for the scalar or vector product of vectors and so 
on). 
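The choice of the threshold discussed earlier in this section (minimizing the sum of the trial cost and the mean loss) can be sketched numerically. The cost function f(eps) = c/eps and the values of k and c below are assumptions made only for illustration; the text does not specify them.

```python
import math

def total_cost(eps, k, c):
    """Sum f(eps) + k*eps, with the ASSUMED trial cost f(eps) = c/eps."""
    return c / eps + k * eps

def best_threshold(k, c):
    """Minimizer of c/eps + k*eps for eps > 0.

    Setting the derivative -c/eps**2 + k to zero gives eps0 = sqrt(c/k).
    """
    return math.sqrt(c / k)

k = 10_000.0  # hypothetical loss caused by one incorrect prediction
c = 0.01      # hypothetical scale of the cost of a more accurate trial
eps0 = best_threshold(k, c)
print(eps0, total_cost(eps0, k, c))
```

With another cost function the minimum would generally have to be located numerically or read from a graph such as Fig. 345.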

15. Variance. The variance characterizes the degree of the spread
of a random variable about its mean (expectation). Let us be given
a random variable $\xi$. By definition, its variance (also called
dispersion) is the quantity

$$D\xi = D\{\xi\} = M\{(\xi - M\xi)^2\} \tag{21}$$

This quantity is deterministic and always positive except in the case
when $\xi$ itself is a deterministic quantity (in this case we have
$D\xi = 0$).

By property 6 in Sec. 14, formula (21) implies the formulas

$$D\xi = \sum_i (x_i - \bar{\xi})^2 p_i \quad\text{and}\quad D\xi = \int (x - \bar{\xi})^2 p(x)\,dx$$

From (21), we easily see that if $\xi$ is multiplied by a constant $C$
then $D\xi$ is multiplied by $C^2$ and that if a constant is added to $\xi$
its variance $D\xi$ does not change. Further, if two random variables
$\xi$ and $\eta$ are independent, we have

$$D\{\xi + \eta\} = D\xi + D\eta \tag{22}$$

In fact,

$$D\{\xi + \eta\} = M\{[\xi + \eta - M(\xi + \eta)]^2\} = M\{[(\xi - M\xi) + (\eta - M\eta)]^2\} =$$
$$= M\{(\xi - M\xi)^2\} + 2M\{(\xi - M\xi)(\eta - M\eta)\} + M\{(\eta - M\eta)^2\} =$$
$$= D\xi + 2M\{\xi - M\xi\}\cdot M\{\eta - M\eta\} + D\eta = D\xi + 2\cdot 0\cdot 0 + D\eta = D\xi + D\eta$$

(where has the independence of the random variables $\xi$ and $\eta$ been
used in the above calculations?).
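Formula (22) and the scaling rule for the variance can be checked exactly on small discrete laws. The following is a minimal sketch; the two distributions and the helper names are chosen here only as an illustration.

```python
from itertools import product

def mean(dist, f=lambda x: x):
    """Mean of f(xi) for a discrete law given as {value: probability}."""
    return sum(f(x) * p for x, p in dist.items())

def variance(dist):
    """Variance by definition (21): mean of the squared deviation."""
    m = mean(dist)
    return mean(dist, lambda x: (x - m) ** 2)

def sum_of_independent(dist_a, dist_b):
    """Law of the sum of two INDEPENDENT variables (probabilities multiply)."""
    out = {}
    for (x, p), (y, q) in product(dist_a.items(), dist_b.items()):
        out[x + y] = out.get(x + y, 0.0) + p * q
    return out

xi = {0: 0.5, 1: 0.5}            # illustrative law
eta = {1: 0.2, 2: 0.3, 4: 0.5}   # illustrative law
total = sum_of_independent(xi, eta)
scaled = {3 * x: p for x, p in xi.items()}  # the variable 3*xi

print(variance(total), variance(xi) + variance(eta))
print(variance(scaled), 9 * variance(xi))
```

The independence enters in `sum_of_independent`, where the joint probabilities are taken as products.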
Let us determine the dispersions for the examples considered in
Secs. 9 and 10. The variance of each summand entering into formula (20)
is equal to $(0 - P)^2 (1 - P) + (1 - P)^2 P = P(1 - P)$.
Hence, by formula (22), we obtain the expression $D\xi = nP(1 - P)$
for the binomial law. Now passing to the limit, as $n \to \infty$, we
obtain the expression $D\xi = a$ for the Poisson distribution. For the
uniform distribution over an interval $a \le x \le b$ we deduce

$$D\xi = \int_a^b \left(x - \frac{a+b}{2}\right)^2 p(x)\,dx = \int_a^b \left[x - \frac{a+b}{2}\right]^2 \frac{dx}{b-a} = \frac{(b-a)^2}{12}$$

(check the result!). Finally, for the normal law we obtain

$$D\xi = \int_{-\infty}^{\infty} (x - a)^2\, \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x-a)^2}{2\sigma^2}\right] dx$$

Substituting $s = \frac{x - a}{\sqrt{2}\,\sigma}$ into the integral we get

$$D\xi = \frac{2\sigma^2}{\sqrt{\pi}} \int_{-\infty}^{\infty} s^2 \exp(-s^2)\,ds$$

Now, putting $s = u$ and $dv = s \exp(-s^2)\,ds$ [that is $ds = du$ and
$v = -\frac{1}{2}\exp(-s^2)$] we integrate by parts and thus deduce the result

$$D\xi = \frac{\sigma^2}{\sqrt{\pi}} \int_{-\infty}^{\infty} \exp(-s^2)\,ds = \sigma^2$$

Together with the variance $D\xi$, we often use the square root of
it which is called the standard deviation of the random variable $\xi$.
The standard deviation is of the same dimension as $\xi$. We
see that the parameter $\sigma$ entering into Gaussian law (12) is nothing
but the standard deviation of the normally distributed random
variable $\xi$.
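The closed-form variances obtained above for the binomial and the uniform laws can be compared with a direct evaluation of definition (21). This is only a numerical sketch; the parameter values are arbitrary.

```python
import math

def binomial_variance(n, P):
    """Variance of the binomial law computed directly from its table."""
    pmf = [math.comb(n, m) * P ** m * (1 - P) ** (n - m) for m in range(n + 1)]
    mean = sum(m * p for m, p in enumerate(pmf))
    return sum((m - mean) ** 2 * p for m, p in enumerate(pmf))

def uniform_variance(a, b, steps=100_000):
    """Variance of the uniform law on [a, b] by the midpoint rule."""
    h = (b - a) / steps
    mid = (a + b) / 2
    total = sum(((a + (i + 0.5) * h) - mid) ** 2 for i in range(steps))
    return total * h / (b - a)

print(binomial_variance(20, 0.3), 20 * 0.3 * 0.7)
print(uniform_variance(2.0, 5.0), (5.0 - 2.0) ** 2 / 12)
```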

Formula (22) implies an important consequence. Let random variables
$\xi_1, \xi_2, \ldots, \xi_n$ be independent and let them be distributed
according to the same law with the standard deviation $\sigma$. Then their
sum has the dispersion $n\sigma^2$, and therefore its standard deviation is
equal to $\sqrt{n}\,\sigma$. Now notice that the above two examples indicate
that the values of a random variable which correspond to the higher
probability are concentrated on an interval whose length is directly
proportional to the standard deviation (this property will be discussed
in more detail in Sec. 19). Hence, for the sum of independent
random summands, the length of such an interval is proportional
to $\sqrt{n}$ (but not to $n$ as it would be if we had $n$ equal summands).
In particular, this law holds for the error of the sum of several
summands which are known with the same accuracy (compare with
Sec. I.9).
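The growth of the error of a sum in proportion to the square root of the number of summands can be observed in a small simulation. The sketch below assumes, purely for illustration, that each summand carries an independent rounding error uniformly distributed on [-0.5, 0.5], so that its standard deviation is sqrt(1/12).

```python
import math
import random
import statistics

def error_of_sum_std(n, trials=20_000, seed=0):
    """Empirical standard deviation of the total error of n summands."""
    rng = random.Random(seed)
    totals = [sum(rng.uniform(-0.5, 0.5) for _ in range(n)) for _ in range(trials)]
    return statistics.pstdev(totals)

s4 = error_of_sum_std(4)
s16 = error_of_sum_std(16)
# Quadrupling the number of summands should roughly double the spread.
print(s4, math.sqrt(4 / 12))
print(s16 / s4)
```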
Two different random variables distributed according to different
laws may have the same expectations and the same variances. Therefore,
to obtain a more complete description of random variables,
we also use some other numerical characteristics. In particular, we
introduce the so-called moments of a random variable (of its
probability distribution) which are defined as

$$M\{\xi^k\} = \int x^k p(x)\,dx \quad (k = 1, 2, 3, \ldots)$$

under the assumption that the corresponding integrals are convergent.
This is the $k$th moment (the moment of order $k$) of the variable
$\xi$. The first moment is nothing but the expectation (mean value),
and the variance, as it is implied by formula (21), is expressed in
terms of the moments of the first and second orders:

$$D\xi = M\{\xi^2\} - 2M\{\xi\, M\xi\} + M\{(M\xi)^2\} = M\{\xi^2\} - (M\xi)^2$$

The higher-order moments characterize the law of probability
distribution of a random variable more completely than the
expectation and the variance.
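To illustrate why higher moments are needed, here is a sketch with two discrete laws (chosen here as an example, not taken from the text) that share the same mean and variance but differ in the fourth moment.

```python
import math

def moment(dist, k):
    """k-th moment of a discrete law given as {value: probability}."""
    return sum(x ** k * p for x, p in dist.items())

def variance(dist):
    """Variance expressed through the first two moments."""
    return moment(dist, 2) - moment(dist, 1) ** 2

two_point = {-1.0: 0.5, 1.0: 0.5}
r = math.sqrt(3.0)
three_point = {-r: 1 / 6, 0.0: 2 / 3, r: 1 / 6}

for law in (two_point, three_point):
    print(moment(law, 1), variance(law), moment(law, 4))
```

Both laws have zero mean and unit variance, while their fourth moments are 1 and 3 respectively.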

16. Correlation. Let us be given the probability density $p(x, y)$
of a joint distribution of two random variables $\xi$ and $\eta$ (see Sec. 11).
If it is known that the variable $\xi$ has assumed the value $\xi = a$ we
can speak about the conditional probability distribution of the random
variable $\eta$ relative to the hypothesis that $\xi$ takes the value $a$.
Let us denote the corresponding conditional density function of $\eta$
relative to the hypothesis $\xi = a$ as $p_\eta(y \mid \xi = a)$*. Then the
conditional probability of $\eta$ falling in an interval between $y$ and
$y + dy$ provided $\xi$ takes the value $\xi = a$ is equal to
$p_\eta(y \mid \xi = a)\,dy$, and

$$\int p_\eta(y \mid \xi = a)\,dy = 1$$

* Translator's note. As it was shown in Sec. 11, knowing the density $p(x, y)$
of the joint distribution we can find the densities of the marginal distributions
$p_\xi(x) = \int p(x, y)\,dy$ and $p_\eta(y) = \int p(x, y)\,dx$ of the random
variables $\xi$ and $\eta$. Now, according to formula (6), we can write

$$P(y < \eta < y + dy \mid \xi = a) = \lim_{dx \to 0} \frac{P(a < \xi < a + dx,\ y < \eta < y + dy)}{P(a < \xi < a + dx)} = \lim_{dx \to 0} \frac{p(a, y)\,dx\,dy}{p_\xi(a)\,dx} = \frac{p(a, y)}{p_\xi(a)}\,dy$$

which implies

$$p_\eta(y \mid \xi = a) = \frac{p(a, y)}{p_\xi(a)}$$

The conditional probability density $p_\xi(x \mid \eta = b)$ of $\xi$ relative to
the hypothesis $\eta = b$ is found in like manner:

$$p_\xi(x \mid \eta = b) = \frac{p(x, b)}{p_\eta(b)}$$

When the value of $\xi$ entering into the hypothesis varies, the law
of the conditional probability distribution of the variable $\eta$ changes
in the general case. Hence, there is a certain relationship between
$\xi$ and $\eta$ but it differs from an ordinary functional relationship
between deterministic variables which was studied in the foregoing
chapters.

A relationship of this kind is called a correlation. Similarly,
if we are given a joint distribution of an arbitrary number of
random variables we can define the correlation between any of the
variables and the rest.

We often encounter relationships of a correlation type. For
instance, when we speak about the relationship between the weight
of a person and his height we undoubtedly mean a correlative rela- 
tion because we know that the weight is not completely and uniquely 
specified by the height. At the same time it is quite clear that the 
law of distribution of the weights of the people two metres high 
differs from that of the people one and a half metres high. When 
we say that smoking reduces the duration of life of a person we also 
mean a correlative dependence because, although there are many 
cases of different kind, we nevertheless find that the average duration 
of life of non-smokers is higher than that of smokers if we consider 
the law of probability distribution of the duration of life. We must 
carefully distinguish between the deterministic and correlative 
dependences and also take into account that in the latter case the 
existence of contradictory examples does not affect the general 
validity of probability inferences. 

The mean value of the conditional probability distribution of
the random variable $\eta$ is a deterministic function of $x$ (if $x$
designates the numerical value assumed by the variable $\xi$, which we
denoted as $\xi = a$ above):

$$M\{\eta \mid \xi = x\} = \int y\, p_\eta(y \mid \xi = x)\,dy$$

Let us denote this function as $f(x)$. The function $f(x)$ (called the
regression function of $\eta$ on $\xi$) expresses the conditional mean value
of $\eta$ relative to the hypothesis $\xi = x$; the graph of the function is
referred to as the regression curve for the mean of $\eta$ (the regression
line of $\eta$ on $\xi$). In the above example of the duration of life of
smokers, it is the regression that defines the regularities we are
interested in. We can similarly determine the conditional mean value
$\varphi(y)$ of $\xi$ relative to the hypothesis that $\eta = y$ and construct the
corresponding regression curve for the mean of $\xi$. It is interesting
that, generally speaking, the functions $f(x)$ and $\varphi(y)$ are not inverse
with respect to each other as it would be if we had a deterministic
relationship. This becomes especially clear in the case of independent
random variables $\xi$ and $\eta$ when both conditional means are constant
and the corresponding regression curves turn into straight
lines parallel to the $x$-axis and to the $y$-axis, respectively.

There is a comparatively simple special case when the regression
of both variables $\xi$ and $\eta$ is linear, i.e. when both regression
functions $f(x)$ and $\varphi(y)$ are linear. Let us introduce the notation

$$r_{\xi,\eta} = \frac{\overline{\xi\eta} - \bar{\xi}\,\bar{\eta}}{\sqrt{D\xi\, D\eta}}$$

The quantity $r_{\xi,\eta}$ is called the correlation coefficient of the random
variables $\xi$ and $\eta$. It is possible to prove that we always have
$|r_{\xi,\eta}| \le 1$ and that in the case when both functions $f(x)$ and
$\varphi(y)$ are linear they have the form

$$f(x) = r_{\xi,\eta} \sqrt{\frac{D\eta}{D\xi}}\,(x - \bar{\xi}) + \bar{\eta} \quad\text{and}\quad \varphi(y) = r_{\xi,\eta} \sqrt{\frac{D\xi}{D\eta}}\,(y - \bar{\eta}) + \bar{\xi}$$

but we shall not give the proof here. It follows that if $r_{\xi,\eta} > 0$ both
functions are increasing and if $r_{\xi,\eta} < 0$ they are decreasing.
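The correlation coefficient and the regression function can be computed directly for a discrete joint law. The sketch below uses a hypothetical joint table of two variables that each take the values 0 and 1; in this case the regression of one variable on the other is necessarily linear (it is defined at two points only), so the linear formula above can be compared with the conditional means.

```python
import math

def summary(joint):
    """Means, variances and correlation coefficient of a joint table."""
    mx = sum(x * p for (x, y), p in joint.items())
    my = sum(y * p for (x, y), p in joint.items())
    mxy = sum(x * y * p for (x, y), p in joint.items())
    dx = sum((x - mx) ** 2 * p for (x, y), p in joint.items())
    dy = sum((y - my) ** 2 * p for (x, y), p in joint.items())
    r = (mxy - mx * my) / math.sqrt(dx * dy)
    return mx, my, dx, dy, r

def regression_eta_on_xi(joint):
    """Conditional mean of the second variable for each value of the first."""
    px = {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
    return {x: sum(y * p for (u, y), p in joint.items() if u == x) / px[x]
            for x in px}

# Hypothetical joint probabilities P(xi = x, eta = y)
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
mx, my, dx, dy, r = summary(joint)
f = regression_eta_on_xi(joint)
linear = {x: r * math.sqrt(dy / dx) * (x - mx) + my for x in f}
print(r, f, linear)
```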

An important example of a linear correlation is the two-dimensional
normal law (see Sec. 11) with the density function

$$p(x, y) = M \exp[-(Ax^2 + 2Bxy + Cy^2)]$$

where $M$ is the normalization factor and the quadratic form in the
parentheses is positive-definite. We suggest that the reader prove
that in this case we have

$$f(x) = -\frac{B}{C}\,x, \quad \varphi(y) = -\frac{B}{A}\,y \quad\text{and}\quad r_{\xi,\eta} = -\frac{B}{\sqrt{AC}}$$

17. Characteristic Functions. The characteristic function of a
random variable $\xi$ is a function of a real parameter $u$ of the form

$$\varphi_\xi(u) = M\{e^{iu\xi}\} \quad (-\infty < u < \infty)$$

Property 6 in Sec. 14 enables us to write in full the expression of
a characteristic function in the form of a sum or of an integral:

$$\varphi_\xi(u) = \sum_k p_k e^{iux_k} \quad\text{or}\quad \varphi_\xi(u) = \int e^{iux} p_\xi(x)\,dx \tag{23}$$

Characteristic functions were first systematically employed
by A. M. Lyapunov.

The second formula (23) is nothing but the Fourier integral of
the function $\varphi_\xi(u)$ [see formula (XVII.141) in which we used another
notation]. Hence, the probability density $p_\xi(x)$ is expressed in
terms of the characteristic function by the formula

$$p_\xi(x) = \frac{1}{2\pi} \int \varphi_\xi(u)\, e^{-iux}\,du$$

We now enumerate some simple properties of a characteristic function.
Formula (23) shows that there must always be $|\varphi_\xi(u)| \le 1$
and $\varphi_\xi(0) = 1$. If $\eta = C_1\xi + C_2$ (where $C_1$ and $C_2$ are constants)
then

$$\varphi_\eta(u) = M\{e^{iu(C_1\xi + C_2)}\} = M\{e^{iC_2u} e^{iC_1u\xi}\} = e^{iC_2u}\varphi_\xi(C_1u)$$

If $\zeta = \xi + \eta$ and the variables $\xi$ and $\eta$ are independent then

$$\varphi_\zeta(u) = M\{e^{iu(\xi+\eta)}\} = M\{e^{iu\xi} e^{iu\eta}\} = M\{e^{iu\xi}\}\, M\{e^{iu\eta}\} = \varphi_\xi(u)\,\varphi_\eta(u)$$

The first formula (23) and the last property enable us to deduce
the expression for the characteristic function of a random variable
having a binomial distribution (see Sec. 9):

$$\varphi_\xi(u) = (1 - P + Pe^{iu})^n$$

Now passing to the limit we obtain the characteristic function of
a random variable distributed according to the Poisson law:

$$\varphi_\xi(u) = \exp(-a + ae^{iu})$$

For the case of a uniform distribution (see Sec. 10) we obtain

$$\varphi_\xi(u) = \frac{e^{ibu} - e^{iau}}{i(b - a)u}$$
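The characteristic functions of the binomial and Poisson laws given above can be checked numerically. In this sketch the first formula (23) is evaluated on the binomial table, and the Poisson limit is approached by taking a large n with P = a/n; the particular numbers are arbitrary.

```python
import cmath
import math

def cf_discrete(law, u):
    """First formula (23): sum of p_k * exp(i*u*x_k) over pairs (x_k, p_k)."""
    return sum(p * cmath.exp(1j * u * x) for x, p in law)

def cf_binomial(n, P, u):
    """Closed form for the binomial law."""
    return (1 - P + P * cmath.exp(1j * u)) ** n

def cf_poisson(a, u):
    """Closed form for the Poisson law with parameter a."""
    return cmath.exp(-a + a * cmath.exp(1j * u))

n, P, u = 12, 0.3, 0.7
binomial_law = [(m, math.comb(n, m) * P ** m * (1 - P) ** (n - m))
                for m in range(n + 1)]
print(cf_discrete(binomial_law, u), cf_binomial(n, P, u))
print(cf_binomial(10 ** 6, 2.0 / 10 ** 6, u), cf_poisson(2.0, u))
```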


For our further aims it is necessary to find the Fourier transform
of the function $f(x) = \exp(-x^2)$. Applying formula (XVII.138)
we obtain

$$\tilde{f}(k) = \frac{1}{2\pi} \int \exp(-x^2 - ikx)\,dx = \frac{1}{2\pi} \exp\left(-\frac{k^2}{4}\right) \int \exp\left[-\left(x + \frac{ik}{2}\right)^2\right] dx$$

(Check it!) But the last integral in fact does not depend on $k$.
Indeed, denoting the integral by $I(k)$ and differentiating we obtain

$$\frac{dI}{dk} = -\int \exp\left[-\left(x + \frac{ik}{2}\right)^2\right] 2\left(x + \frac{ik}{2}\right) \frac{i}{2}\,dx = \frac{i}{2} \exp\left[-\left(x + \frac{ik}{2}\right)^2\right] \Bigg|_{x=-\infty}^{\infty} = 0$$

(why is it so?). Consequently, $I(k) = I(0) = \int \exp(-x^2)\,dx = \sqrt{\pi}$.
Thus, finally we get $\tilde{f}(k) = \frac{1}{2\sqrt{\pi}} \exp\left(-\frac{k^2}{4}\right)$. From this,
with the help of property 3 in Sec. XVII.33, we conclude that the
Fourier transform of the function $\exp(-ax^2)$ (where $a > 0$) is the
function

$$\frac{1}{2\sqrt{\pi a}} \exp\left(-\frac{k^2}{4a}\right)$$

Now we can readily determine the characteristic function of a
random variable distributed according to the Gaussian law (see
Sec. 10). Let us first take the case $a = 0$. Since formula (23) expresses
the inverse Fourier transform of the function $p_\xi(x)$, it is sufficient
to multiply our result by $2\pi$ which yields

$$\varphi_\xi(u) = \exp\left(-\frac{\sigma^2 u^2}{2}\right)$$

To investigate the general case when $a \ne 0$ we can add $a$ to the
above random variable for which the characteristic function has
just been computed. Then denoting the new variable by the same
letter $\xi$ and taking advantage of the properties of characteristic
functions enumerated above we finally obtain

$$\varphi_\xi(u) = e^{iau} \exp\left(-\frac{\sigma^2 u^2}{2}\right) = \exp\left(iau - \frac{\sigma^2 u^2}{2}\right)$$

In particular, this result implies a remarkable consequence. Let
$\xi_1$ and $\xi_2$ be two independent random variables distributed
according to the normal law with the parameters $a_1, \sigma_1$ and
$a_2, \sigma_2$, respectively. Then, for the variable $\xi = \xi_1 + \xi_2$, we obtain

$$\varphi_\xi(u) = \varphi_{\xi_1}(u)\,\varphi_{\xi_2}(u) = \exp\left(ia_1u - \frac{\sigma_1^2 u^2}{2}\right) \exp\left(ia_2u - \frac{\sigma_2^2 u^2}{2}\right) = \exp\left[i(a_1 + a_2)u - \frac{(\sigma_1^2 + \sigma_2^2)u^2}{2}\right]$$

Thus, we have again arrived at the normal law with the parameters
$a = a_1 + a_2$ and $\sigma = \sqrt{\sigma_1^2 + \sigma_2^2}$. The invariance of the normal
law with respect to the addition of random variables is one of the
basic properties of the law which accounts for its being so widely
spread. Among the probability distributions of discrete random
variables, the Poisson law possesses this property.
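The characteristic function of the Gaussian law and its behaviour under addition of independent variables can be verified numerically. The sketch below evaluates the second formula (23) by the midpoint rule on a truncated interval (the truncation width and step count are arbitrary choices) and compares it with the closed form.

```python
import cmath
import math

def normal_density(x, a, sigma):
    """Density of the normal law with mean a and standard deviation sigma."""
    return math.exp(-(x - a) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def cf_numeric(u, a, sigma, half_width=12.0, steps=20_000):
    """Midpoint-rule value of the integral of exp(i*u*x) * p(x)."""
    lo = a - half_width * sigma
    h = 2 * half_width * sigma / steps
    total = 0j
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += cmath.exp(1j * u * x) * normal_density(x, a, sigma)
    return total * h

def cf_normal(u, a, sigma):
    """Closed form exp(i*a*u - sigma^2 * u^2 / 2)."""
    return cmath.exp(1j * a * u - sigma ** 2 * u ** 2 / 2)

u = 0.8
print(cf_numeric(u, 1.5, 2.0), cf_normal(u, 1.5, 2.0))
# Product of two characteristic functions versus the law of the sum
product_cf = cf_normal(u, 1.0, 0.5) * cf_normal(u, -2.0, 1.2)
sum_cf = cf_normal(u, 1.0 - 2.0, math.hypot(0.5, 1.2))
print(product_cf, sum_cf)
```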


§ 4. Applications of the Normal Law 

18. The Normal Law as the Limiting One. We now investigate
the behaviour of the binomial law (see Sec. 9) when $P$ remains
constant and $n \to \infty$. A random variable $\xi^{(n)}$ distributed according
to the binomial law has the mean $a_n = nP$ and the standard deviation
$\sigma_n = \sqrt{n}\sqrt{P(1 - P)}$ (see Secs. 14 and 15) and hence we have
$a_n \to \infty$ and $\sigma_n \to \infty$ for $n \to \infty$. Thus, we see that $\xi^{(n)}$ "spreads"
over the whole $x$-axis in the limit. This makes it difficult to
investigate the behaviour of $\xi^{(n)}$ directly. It is therefore convenient
to perform a linear transformation of the variable $\xi^{(n)}$ so that the
mean value should become equal to zero and the standard deviation
become equal to unity after the transformation has been carried
out. A transformation of this kind is called the standardization (or
normalization) of the variable $\xi^{(n)}$, and it is expressed by the
following simple formula:

$$\eta^{(n)} = \frac{\xi^{(n)} - a_n}{\sigma_n}$$

The variable $\eta^{(n)}$ is known as the standardized (normalized) variable
corresponding to the random variable $\xi^{(n)}$.

There is a remarkable theorem referred to as the De Moivre-Laplace
theorem which states that the law of distribution of the
above standardized random variable tends to the normal law when
$n \to \infty$. (P. Laplace, 1749-1827, a famous French astronomer,
physicist and mathematician.)

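The theorem is proved as follows. By Sec. 17, we have

$$\varphi_{\eta^{(n)}}(u) = e^{-i\frac{a_n}{\sigma_n}u} \left(1 - P + Pe^{i\frac{u}{\sigma_n}}\right)^n =$$
$$= \exp\left(-\frac{inPu}{\sqrt{nP(1-P)}}\right) \left[1 - P + P\exp\left(\frac{iu}{\sqrt{nP(1-P)}}\right)\right]^n =$$
$$= \left\{\exp\left(-\frac{iPu}{\sqrt{nP(1-P)}}\right) \left[1 - P + P\exp\left(\frac{iu}{\sqrt{nP(1-P)}}\right)\right]\right\}^n$$

Expanding the expression in the curly brackets in powers of
$\frac{1}{\sqrt{n}}$ we obtain

$$\varphi_{\eta^{(n)}}(u) = \left\{\left[1 - iu\sqrt{\frac{P}{n(1-P)}} - \frac{Pu^2}{2n(1-P)} + \ldots\right] \times \left[1 + \frac{iPu}{\sqrt{nP(1-P)}} - \frac{Pu^2}{2nP(1-P)} + \ldots\right]\right\}^n =$$
$$= \left\{1 - \frac{u^2}{2n} + \ldots\right\}^n \xrightarrow[n \to \infty]{} \exp\left(-\frac{u^2}{2}\right)$$

(check the calculations!). Thus we have arrived at the characteristic
function of the normal law (see Sec. 17) with the parameters
$a = 0$ and $\sigma = 1$.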
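The convergence stated by the De Moivre-Laplace theorem can be observed numerically on the characteristic functions themselves. In the sketch below the values P = 0.3 and u = 1 are arbitrary; the distance to exp(-u^2/2) shrinks as n grows.

```python
import cmath
import math

def cf_standardized_binomial(n, P, u):
    """Characteristic function of (xi_n - a_n) / sigma_n for the binomial law."""
    a_n = n * P
    sigma_n = math.sqrt(n * P * (1 - P))
    shift = cmath.exp(-1j * a_n * u / sigma_n)
    return shift * (1 - P + P * cmath.exp(1j * u / sigma_n)) ** n

u = 1.0
limit = math.exp(-u ** 2 / 2)
errors = [abs(cf_standardized_binomial(n, 0.3, u) - limit)
          for n in (100, 10_000, 1_000_000)]
print(errors)
```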

In Sec. 14 we mentioned that a variable distributed according 
to the binomial law is the sum of n independent random summands 
with the same simplest law of probability distribution. But it turns 
out that the normal law is obtained in the limit for any initial law 
of distribution (of course, except a deterministic law). 

Indeed, for the sake of simplicity, let us suppose that we have
an initial law of distribution with the characteristic function $\varphi_0(u)$
and with the zero mean value, the variance being equal to unity.
These limitations are inessential because in the general case we carry
out the normalization. Then formula (23) implies $\varphi_0'(0) = 0$ and
$\varphi_0''(0) = -1$ etc., and hence, on the basis of Taylor's formula, we
have $\varphi_0(u) = 1 - \frac{u^2}{2} + \ldots$. Using the notation similar to the
above we find

$$\varphi_{\eta^{(n)}}(u) = \left[\varphi_0\left(\frac{u}{\sqrt{n}}\right)\right]^n = \left[1 - \frac{u^2}{2n} + \ldots\right]^n \xrightarrow[n \to \infty]{} \exp\left(-\frac{u^2}{2}\right)$$

It turns out that the condition that the laws of distribution of
the summands should be the same is also inessential. For instance,
A. M. Lyapunov proved that the law of distribution for the standardized
sum of independent random summands $\xi_1, \xi_2, \ldots, \xi_n$ is
also close to the Gaussian law when $n$ is large if the ratio

$$\sum_{k=1}^{n} M|\xi_k - a_k|^3 : \left(\sum_{k=1}^{n} D\xi_k\right)^{3/2} \quad (a_k = M\xi_k)$$


is small. This condition is violated if the variance of a small number 
of summands considerably exceeds the variance of the rest. In this 
case the latter summands do not contribute to the whole result, in 
the limit, after the standardization has been performed. Lyapunov’s 
condition is also violated in some other special cases, for instance, 
in the case leading to the Poisson law (check up this assertion!). 

If the standardization results in a normal distribution we obvious- 
ly have a normally distributed variable before the standardization 
but with an arbitrary mean value and variance. Hence, we can say 
that the sum of many independent random summands is normally 
distributed irrespective of the laws of distribution of the summands. 
The exceptions to the rule are the cases enumerated in the fore- 
going paragraph. Here lies the main cause making the Gaussian 
law so important. 

In particular, it is usually assumed that the random errors of a 
measurement obey the normal law. Actually, as a rule, such an error 
results from mutual superposition of a great many small indepen- 
dent errors which cannot be taken into account separately. It is 
this fact that leads to the assumption that the Gaussian law is appli- 
cable here. 

19. Confidence Interval. We now come back to the problem of 
tossing a coin which was considered in Sec. 1. It is clear that if the 
coin comes up heads 200 times in 1000 tosses we have every reason 
to suspect that something is wrong. Shall we say the same if we 
have 400 heads or 450 heads? In other words, shall we consider it 
to be unusual if the relative frequency of the coin coming up heads 
is 0.4 or 0.45? Now we are able to answer a question of this type. 

Let us take a more general situation. Consider a random variable
$\xi$ (in the above example the role of $\xi$ was played by the number of
occurrences of heads in one toss). Suppose that $n$ trials have been
performed, and let $\xi$ assume the values $x_1, x_2, \ldots, x_n$ in these
trials. Let us designate the arithmetic mean value of the quantities
$x_1, x_2, \ldots, x_n$ by $\bar{x}^{(n)}$:

$$\bar{x}^{(n)} = \frac{1}{n}(x_1 + x_2 + \ldots + x_n) \tag{24}$$

On the basis of Sec. 13, we can assert that $\bar{x}^{(n)} \to \bar{\xi}$ for $n \to \infty$. This
is the so-called law of large numbers (in our course we have taken
the law of large numbers as the foundation of the definition of the
mean $\bar{\xi}$ of a random quantity $\xi$). But what is the rate at which
$\bar{x}^{(n)}$ approaches $\bar{\xi}$ as $n \to \infty$?

Let us consider the random variable

$$\bar{\xi}^{(n)} = \frac{1}{n}(\xi_1 + \xi_2 + \ldots + \xi_n)$$

where all the summands are independent, each $\xi_i$ ($i = 1, 2, \ldots, n$)
being distributed according to the same law of probability
distribution as $\xi$. Then quantity (24) is one of the possible values of the
variable $\bar{\xi}^{(n)}$. But, on the basis of Sec. 18, we can regard the random
variable $\bar{\xi}^{(n)}$ as being distributed according to the Gaussian law
for large $n$. Besides, we have

$$M\bar{\xi}^{(n)} = \frac{1}{n}(M\xi_1 + M\xi_2 + \ldots + M\xi_n) = \bar{\xi}$$

and

$$D\bar{\xi}^{(n)} = \frac{1}{n^2}\, nD\xi = \frac{1}{n} D\xi = \frac{\sigma_\xi^2}{n}$$


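where $\sigma_\xi$ is the standard deviation of the variable $\xi$. Therefore,
by formula (14), we can put

$$P\{\alpha < \bar{\xi}^{(n)} < \beta\} = \frac{1}{2}\left[\Phi\left(\frac{(\beta - \bar{\xi})\sqrt{n}}{\sigma_\xi}\right) - \Phi\left(\frac{(\alpha - \bar{\xi})\sqrt{n}}{\sigma_\xi}\right)\right]$$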

For the sake of simplicity, let us restrict ourselves to the case of
symmetric intervals of the form $|\bar{\xi}^{(n)} - \bar{\xi}| \le \delta$. Then we obtain

$$P\{|\bar{\xi}^{(n)} - \bar{\xi}| \le \delta\} = \Phi\left(\frac{\delta\sqrt{n}}{\sigma_\xi}\right) \tag{25}$$

It is formula (25) that provides the answer to the problem stated
at the beginning of this section. In practical applications it can be
used for any values of $n$. Here we give a rough table of the values of
the function $\Phi(t)$:


t          0.0   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1.0   1.1
$\Phi(t)$  0.00  0.08  0.16  0.24  0.31  0.38  0.45  0.52  0.58  0.63  0.68  0.73

t          1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9   2.0   2.1   2.2   2.3   2.4
$\Phi(t)$  0.77  0.81  0.84  0.87  0.89  0.91  0.93  0.94  0.95  0.96  0.97  0.98  0.98

t          2.6   3.0    3.3    3.9     4.4      4.9       5.3        $\infty$
$\Phi(t)$  0.99  0.997  0.999  0.9999  0.99999  0.999999  0.9999999  1


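For large values of $t$ the corresponding values of $\Phi(t)$ can be found
with a great accuracy by means of the asymptotically convergent
series (see Sec. XVII.19):

$$\Phi(t) = 1 - \sqrt{\frac{2}{\pi}}\, e^{-t^2/2}\, \psi(t)$$

where

$$\psi(t) = \frac{1}{t} - \frac{1}{t^3} + \frac{1\cdot 3}{t^5} - \frac{1\cdot 3\cdot 5}{t^7} + \ldots$$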
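The asymptotic series can be compared with a library evaluation of the same function. The sketch below assumes that the tabulated function is the probability that a standard normal variable lies within (-t, t), which equals erf(t/sqrt(2)); this identification is inferred from the tabulated values (for instance 0.68 at t = 1 and 0.997 at t = 3) and the function names are chosen here.

```python
import math

def psi(t, terms=4):
    """Partial sum 1/t - 1/t^3 + 1*3/t^5 - 1*3*5/t^7 + ..."""
    total, coeff = 0.0, 1.0
    for k in range(terms):
        total += (-1) ** k * coeff / t ** (2 * k + 1)
        coeff *= 2 * k + 1  # builds 1, 1*3, 1*3*5, ...
    return total

def tail_asymptotic(t, terms=4):
    """Approximation to 1 - Phi(t) from the asymptotic series."""
    return math.sqrt(2 / math.pi) * math.exp(-t ** 2 / 2) * psi(t, terms)

def phi_exact(t):
    """Phi(t) expressed through the error function (assumed identification)."""
    return math.erf(t / math.sqrt(2))

t = 5.0
exact_tail = math.erfc(t / math.sqrt(2))  # accurate value of 1 - Phi(t)
print(tail_asymptotic(t), exact_tail)
print(phi_exact(1.0), phi_exact(3.0))
```

Because the series is asymptotic rather than convergent, taking more terms at a fixed t eventually makes the approximation worse; a few terms are enough for large t.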


Now let us return to the problem of tossing the coin. Suppose we
agree that an event whose probability is less than 0.01 will be
considered to be highly improbable. Then choosing $\delta$ so that the
right-hand side of (25) should become equal to 0.99 we obtain the
corresponding interval which includes all those values which we regard
as probable. In our case we obtain $\frac{\delta\sqrt{n}}{\sigma_\xi} = 2.6$. We have
$n = 1000$ and $\sigma_\xi = 0.5$, and hence $\delta = 0.041$. Let us denote the
number of occurrences of heads by $M$. Then we get a confidence interval
$459 \le M \le 541$ for $M$. Thus we have obtained confidence limits for
the number $M$ which we can guarantee disregarding those events
which are considered to be practically impossible relative to the
criterion we have chosen (see Sec. 7).
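The arithmetic of this example can be reproduced in a few lines. The function name below is hypothetical; the value t = 2.6 is the table entry used in the text for the level 0.99.

```python
import math

def confidence_interval_for_count(n, p, t):
    """Limits for the number of successes in n trials, following formula (25).

    p is the probability of success in one trial, t is the value for which
    Phi(t) equals the chosen confidence level.
    """
    sigma = math.sqrt(p * (1 - p))       # standard deviation for one trial
    delta = t * sigma / math.sqrt(n)     # half-width for the relative frequency
    low = math.ceil(n * (p - delta))
    high = math.floor(n * (p + delta))
    return low, high, delta

low, high, delta = confidence_interval_for_count(1000, 0.5, 2.6)
print(low, high, round(delta, 3))

def is_unusual(heads):
    """True if the observed count falls outside the confidence interval."""
    return not (low <= heads <= high)

print(is_unusual(400), is_unusual(450), is_unusual(480))
```

Under this criterion both 400 and 450 heads in 1000 tosses fall outside the interval and would be regarded as unusual.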

20. Data Processing. In Sec. 19 we considered a random variable
$\xi$ whose law of probability distribution was regarded as being known.
But in practical problems we most often encounter cases when the
law of probability distribution is not known beforehand and when
the numerical characteristics of the random variable $\xi$ under
investigation should be determined on the basis of the outcomes of the
trials. For instance, this is the case when it is necessary to find the
mean value of a certain parameter characterizing some property
(such as strength or longevity and the like) of a manufactured article
belonging to a lot (general population) on the basis of inspecting
a sample of a number of articles randomly drawn from the lot.
The same problem arises when we repeatedly measure some unknown
physical quantity because the result of every measurement
is a random quantity due to the unavoidable random errors. When
there are no systematic errors the arithmetic mean of the experimental
data is taken as an approximation to the precise value of the
quantity in question and so on.

Suppose that we have performed $n$ trials and that a random variable
$\xi$ has taken the corresponding numerical values $x_1, x_2, \ldots, x_n$.
Then our previous investigation indicates that it is the arithmetic
mean value $\bar{x}^{(n)}$ of the results we have obtained [defined by
formula (24)] that should be taken as an approximate value of $\bar{\xi}$. Besides,
it turns out that an approximate value of the standard deviation
$\sigma_\xi$ can be chosen as

$$\sigma_\xi \approx \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}^{(n)})^2} \tag{26}$$

To prove (26) we use the notation introduced in Sec. 19 and
additionally put $\zeta_n = \frac{1}{n-1} \sum_{i=1}^{n} (\xi_i - \bar{\xi}^{(n)})^2$. Computing the mean value
of $\zeta_n$ we find

$$M\zeta_n = \frac{1}{n-1}\, M\left\{\sum_{i=1}^{n} \xi_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} \xi_i\right)^2\right\}$$

(check the result!). Now, representing each $\xi_i$ in the form
$\xi_i = (\xi_i - \bar{\xi}) + \bar{\xi}$ and calculating we obtain

$$M\zeta_n = \frac{1}{n-1}\left[nD\xi + n\bar{\xi}^2 - \frac{1}{n}\left(nD\xi + n^2\bar{\xi}^2\right)\right] = D\xi = \sigma_\xi^2$$

Besides, computing $D\zeta_n$ we find that $D\zeta_n \to 0$ when $n \to \infty$. (Let
the reader perform the calculations.) But the radicand in (26) is
nothing but a value of the random variable $\zeta_n$. This implies (26).

Now we can set a criterion according to which random events will
be considered to be practically impossible (see Sec. 7) and then,
applying formula (25), construct a confidence interval for $\bar{\xi}$. For
instance, let us disregard the events whose probability is less than
0.003. Then (25) indicates that we can put $\frac{\delta\sqrt{n}}{\sigma_\xi} = 3$. Thus, we get
the following confidence interval for $\bar{\xi}$:

$$\bar{x}^{(n)} - \frac{3\sigma_\xi}{\sqrt{n}} < \bar{\xi} < \bar{x}^{(n)} + \frac{3\sigma_\xi}{\sqrt{n}} \tag{27}$$

The confidence limits thus obtained are guaranteed with a probability
of 0.997. Formula (27) is widely applied to practical problems.
The value of $\sigma_\xi$ entering into (27) can be taken from formula (26).
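Formulas (24), (26) and (27) translate directly into a short routine. The measurements below are hypothetical numbers used only to show the computation.

```python
import math

def sample_mean(xs):
    """Arithmetic mean, formula (24)."""
    return sum(xs) / len(xs)

def sample_std(xs):
    """Approximate standard deviation, formula (26), with divisor n - 1."""
    m = sample_mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def three_sigma_interval(xs):
    """Confidence interval (27) for the mean, at the level of about 0.997."""
    m = sample_mean(xs)
    half = 3 * sample_std(xs) / math.sqrt(len(xs))
    return m - half, m + half

measurements = [9.8, 10.1, 10.0, 9.9, 10.2]  # hypothetical data
low, high = three_sigma_interval(measurements)
print(sample_mean(measurements), sample_std(measurements), (low, high))
```

For so few measurements the normal approximation behind (25) is rough, so the interval should be read as indicative only.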

Let us discuss a consequence of formula (27). Suppose that we
have performed two independent series of $n$ trials for one and the
same random variable $\xi$ and that this has resulted in the mean
values $\bar{x}_1^{(n)}$ and $\bar{x}_2^{(n)}$. Then estimation (27) holds for both values with
the probability $(0.997)^2 = 0.994$, and hence
$|\bar{x}_1^{(n)} - \bar{x}_2^{(n)}| \le \frac{6\sigma_\xi}{\sqrt{n}}$
with this probability. This means that the empirical means $\bar{x}^{(n)}$
assume some specified values with a great probability in any series
of trials when the number of trials $n$ is sufficiently large. This result
is quite clear in its qualitative aspect because it is obviously
implied by the definition of the mean.

Let us be given two correlated random variables $\xi$ and $\eta$ and let
the regression of both variables be linear (see Sec. 16). If $n$ trials
are performed we obtain $n$ pairs of the values assumed by the variables:
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Then we can pose the problem
of approximating the regression line of $\eta$ on $\xi$ on the basis of the
data. It turns out that the solution of this problem directly leads
to the method of least squares described in Sec. XII.8.
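As a minimal sketch of this approach, the following routine fits a straight line to pairs of observations by the usual least-squares formulas. The data are hypothetical and the function name is chosen here; Sec. XII.8 may present the method in a different form.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the line minimizing the sum of squared residuals."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

xs = [1.0, 2.0, 3.0, 4.0]   # hypothetical observed values of xi
ys = [2.1, 3.9, 6.2, 7.8]   # hypothetical observed values of eta
slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)
```

The fitted line serves as an empirical approximation to the regression line of the second variable on the first.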

The theory of probability was originated in the 17th century in 
connection with investigating regularities of games of chance. It 
was thoroughly developed in the 19th and 20th centuries and is now 
an important branch of mathematics which has many applications 
in various divisions of science. Many prominent mathematicians 
took part in creating the theory of probability. Among Russian 
scientists who contributed much to the theory we can mention 
P. L. Chebyshev, A. M. Lyapunov, A. A. Markov, S. N. Bernstein 
(1880-1968), A. N. Kolmogorov, Yu. V. Linnik and others. There 
is a mathematical science called mathematical statistics which is
directly related to the theory of probability. Mathematical statistics
deals with problems of processing statistical data. There are
many courses on the theory of probability, mathematical statistics
and their applications. For a beginner we recommend [1], [17],
[41], [45] and [48].



CHAPTER XIX 


Computers 


The simplest computing devices such as a slide rule, an abacus, 
an arithmometer and tables are well known and widely applied to
practical problems. But these devices do not meet the requirements 
of modern science, engineering and economics. There are many im- 
portant problems which are solvable, in principle, and whose solu- 
tion cannot be practically obtained by means of the above devices 
because this would take too much time. Therefore in applying these 
simplest tools we usually disregard many essential factors in order 
to simplify the calculations, and this often leads to quantitative 
and even to essential qualitative errors. 

The need for more effective computing devices and the achieve- 
ments of modern technology have led to a revolution in data hand- 
ling and problem solving practice. This has resulted in inventing 
and constructing high-speed electronic computers. The intensive 
application of these machines has made it possible to solve many 
important problems and to obtain fruitful results by means of 
introducing mathematical methods in a number of new fields of 
human activities. There is no doubt that the development of modern 
calculating devices and the extension of their application will result 
in radical reorganization of scientific research, engineering work, 
economics, control, service and so on. 

§ 1. Two Classes of Computers 

There are two basic methods of representing mathematical quan- 
tities entering into calculations. The first method is based on a direct 
representation of mathematical quantities by some physical quan- 
tities such as lengths, angles, voltages and the like. To perform 
certain operations on these quantities it is necessary to construct 
a physical system in which the corresponding physical quantities 
are transformed according to the law which describes the transformation
of the mathematical quantities in question. Computers based

We see that this scheme is convenient for studying the way in
which the variations of the coefficients affect the solution. Modify- 
ing this scheme we can solve many other similar problems. But it 
should be noted that in case the problem in question has several 
solutions this scheme may yield a solution different from the one 
we are interested in. 

Let us take one more example illustrating the integration of a
differential equation. For instance, let us take an equation of the form

$$y'' + f(y') + y = \psi(x) \quad (y = y(x))$$

The corresponding scheme is shown in Fig. 349. Let the reader verify
the correctness of the scheme [in doing this it is more convenient
to rewrite the equation in the form $y = \psi(x) - y'' - f(y')$]. Here
the quantity $y$ is a variable whose variations should be related to the
variations of the independent variable $x$.

Fig. 349

We often choose the time $t$ as an independent variable. For this
case the differentiator (see Fig. 347e) has only one input to which
the voltage $\psi(t)$ is applied. The above scheme for solving the
differential equation should then be changed correspondingly. Namely,
instead of putting in the voltage $x$ we must simply apply the voltage
$\psi(t)$ to the terminal which is represented by the point in Fig. 349.
This can be performed without calculating $\psi(t)$ if we read this
signal from a graph or from some data unit. Such a scheme is
particularly convenient when the physical meaning of the problem
indicates that the independent variable is the time $t$. Then the problem
can be solved in real time. The solution can be directly transmitted
(without human intervention) to some other system for further
utilization. Many devices intended for automatic control are based
on this idea.

The real time simulation makes it possible to replace some expensive
aggregates by computers when testing a complicated device.
For instance, instead of testing an autopilot in a flight, which is
dangerous and expensive, we can do it in a test stand where an analogue
computer substitutes for a real airplane. It is apparent that
this computer must precisely simulate the reaction of the airplane
to the actions of the autopilot.

The fundamentals of the theory of analogue computers can be 
found in [12] and [30]. 

2. Digital Computers. The first devices of the abacus type
were invented in China as early as 2,000 or 3,000 years B.C. The
first mechanical computer capable of performing addition and
subtraction was constructed in 1642 by the prominent French
mathematician, physicist and philosopher B. Pascal (1623-1662). The modern
desk calculators (summing machines capable of performing addition
and subtraction, and semiautomatic and automatic arithmometers
which perform all four fundamental operations of arithmetic)
are essentially refinements of Pascal’s calculator. All these useful
devices cannot work very fast because the speed of performing
calculations is low and because the input data are entered into such
a machine by an operator.

At the beginning of the 20th century the necessity for processing 
a large number of similar data in statistics, book-keeping and finan- 
ces led to the creation of punch card computers (tabulators, sorters 
etc.). The input data to be inserted in such machines are punched 
on cards (punch cards; see Sec. 4). The punch card reader reads the 
cards by means of a set of brushes (which sense the holes punched in 
the cards) and produces the corresponding electric pulses. These 
electric signals make the machine work according to the given pro- 
gram. It can sort the cards, sum up the parameters, accumulate 
the results and perform some other simple operations. These ma- 
chines are very useful but they cannot be applied to more complicated 
calculations. 

As has been mentioned, the revolution in this field is connected
with the appearance of a new class of calculating machines which
are referred to as universal high-speed digital automatic computers.
Although the idea of constructing such machines appeared as early 
as the 19th century, it is only the achievements of modern electro- 
nics that have made it possible to realize the idea. The first high- 
speed electronic digital computer ENIAC (Electronic Numerical 
Integrator and Computer) was constructed in the USA in 1943. The 
basic theoretical ideas and principles of constructing such machines 
were formulated by the American mathematician J. von Neumann 
(1903-1957) in 1946. In the USSR the first machines of this type 
were manufactured in 1952. 

The block diagram of an automatic digital computer is shown in 
Fig. 350 which represents the main units and their interconnection. 
In the memory (store or storage) there are a number of locations
(cells or compartments) each of which can store one number or one
instruction (order or command). As we shall see in Sec. 5, the form
in which an instruction is represented within the machine does not
differ from that of a number. Some of these locations store the
information obtained from the input device before the calculations
are started whereas the others can be filled in the process of work.
When the machine performs calculations the contents of the loca- 
tions may change many times, and some of the locations may remain 
empty and may not be used. The control unit interprets the instruc- 
tions given to the machine and sends the numbers (or instructions) 
into the arithmetic unit. The arithmetic unit transforms the numbers 
according to the instructions and then returns them to the memory. 
As the required results are obtained, they are automatically printed
according to the signals sent by the control unit, and after the
computations have been completed the control unit stops the computer.
(In § 2 we shall describe the sequence of the operations in greater
detail.)

Fig. 350 (the dotted lines represent control channels)

Any mathematical problem can be solved by means of a universal
automatic digital computer provided that a certain algorithm is
given. An algorithm is understood as a precise prescription defining
a step-by-step procedure for solving the problem which necessarily leads
to the required result on the basis of the input data. But there 
are also many non-mathematical problems for which a similar algo- 
rithm can be indicated, and such problems are solvable by means 
of a digital computer. For example, an electronic computer can 
control the process of machining a workpiece of a complicated pro- 
file in a metal-cutting lathe. In this case the input data characterize 
the profile, and the results of the calculations are transformed into 
signals controlling the lathe. The device controlling the flight of
an airplane beginning with the take-off and up to the landing at 
a prescribed place operates in a similar way. A digital computer 
can control a manufacturing process and makes it possible to
automate the process completely, which is particularly important in
those industries which are hazardous to human health. If the
performance of the process deviates from the initial program the
computer can find the most advantageous solution by comparing
different variants, and it can also check the result. Of course, the
necessary information concerning the real state of the process should be
entered in such a machine automatically by means of special devices.
A computer can help in designing an engineering construction be- 
cause it can examine hundreds of variants and choose the best of them 
by applying a certain given criterion. Digital computers are also 
chosen at the end of each location and there are six such digits.
Then the maximal number which can be stored in a location (having 
30 digits) is represented in the following form: 


  sign of   |  binary digits of the number  | sign of the | binary digits
 the number |  to the right of the binary   |  exponent   | of the exponent     (3)
            |            point              |             |


Hence, this number is equal to $(1 - 2^{-23}) \cdot 2^{31} = 2^{31} - 2^{8} \approx 2^{31}$.
Accordingly, the minimal positive number which can be represented
by this method is stored in the form

0 1 0      (4)

and thus it is equal to $2^{-23} \cdot 2^{-31} = 2^{-54}$ (remember that the first
method gave us the least positive number $2^{-29}$).
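
The two extreme values can be checked with exact rational arithmetic. The sketch below assumes the layout implied by the text: a 30-digit location holding one sign digit, 23 binary digits to the right of the binary point, one sign digit for the exponent and 5 binary digits of the exponent. The function name `value` is hypothetical, and the signs are passed as +1 or −1 rather than as machine digits.

```python
from fractions import Fraction

MANTISSA_DIGITS = 23   # binary digits to the right of the binary point (assumed)
EXPONENT_DIGITS = 5    # binary digits of the exponent, its sign being a sixth digit

def value(mantissa_bits, exponent_sign, exponent_bits):
    """Value of the positive number 0.m1 m2 ... m23 times 2 to the power +-e."""
    assert len(mantissa_bits) == MANTISSA_DIGITS
    assert len(exponent_bits) == EXPONENT_DIGITS
    mantissa = Fraction(int(mantissa_bits, 2), 2 ** MANTISSA_DIGITS)
    exponent = exponent_sign * int(exponent_bits, 2)
    return mantissa * Fraction(2) ** exponent

# Largest number: all mantissa digits equal to 1, exponent +31.
largest = value("1" * 23, +1, "11111")
# Least positive number: only the last mantissa digit equal to 1, exponent -31.
smallest = value("0" * 22 + "1", -1, "11111")

print(largest == 2**31 - 2**8)
print(smallest == Fraction(1, 2**54))
```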

It sometimes occurs that the number of digits in a location is
insufficient for guaranteeing the accuracy needed. In such cases it
is possible to use a special program which takes two consecutive
locations (the first and the second, the third and the fourth etc.)
for each number (this is called the double precision method).

The input operation, that is “writing” numbers, is realized in
different ways depending on the type of computer. The numbers to
be introduced into the input unit of a computer (see Sec. 2) are
punched on a special tape (punch tape) or on a deck of special cards
(punch cards). In the first case to each location there corresponds
a certain part of the tape and in the second case a certain row on a
card. The length of a row corresponds to the number of digits
contained in a location; a hole punched in the card corresponds to the
digit 1 and an unpunched place corresponds to the digit 0. For
instance, if the numbers (2), (3) and (4) (we mean the decimal form
of the numbers here) follow in succession in the input program the
corresponding punched card contains a part of the form shown in
Fig. 351 (check it!). The holes are punched by means of a special
perforator (which is not directly connected with the machine) before
the machine is started.

A programmer who composes the program (routine) for solving
a problem often writes the instructions (see Sec. 5) in the octal
system and the input numbers in the decimal system. Then, after
the program is typed on the keyboard of the perforator (which
resembles a cash register used in shops), the instructions are punched
on cards in the binary form, according to the program, and the input
data are punched in the binary-decimal code. The conversion of
numbers to the binary system is carried out according to the
instructions written by the programmer in the program, the operation being
performed within the machine at the beginning of its work by means
of a special subprogram (subroutine) supplied to the machine.

Punch cards are in some respects more convenient than a punch
tape because a program can be very extensive and can occupy
a deck of cards or a reel of tape, and it is much easier to replace
several cards in the deck in case some mistakes have been found
than to insert the corrections into the tape.

Fig. 351

The representation of numbers stored in the memory of a computer
(and transformed in the process of calculations) is realized in the
machine by means of two-state elements. Each location is a set of
such elements, the number of the elements being equal to the number
of binary digits in the location. A feature of these elements is that
each of them can be in one of two states. One of these states is
regarded as a physical embodiment of 0 and the other as a physical
embodiment of 1, and thus each binary place in a location can carry
either 0 or 1. The construction of the elements varies with the model
of a computer but in all cases they must be inertialess (i.e. they
must pass from one state to the other practically instantaneously)
and stable (i.e. they must remain in each of the states indefinitely
unless there is a transition signal). An element can be realized as
a trigger consisting of two triodes whose cathodes are connected
together and whose anodes and grids are connected crosswise. In such
a circuit either one triode or the other is always cut off, and the
transition from one state into the other is performed under the action
of a voltage pulse. Charged and uncharged parts of a dielectric plate
can also serve as the elements, the transition from one state to the
other being performed by means of an electron-beam tube similar to that
used in a television set. In some computers magnetic drums are
used as storage devices. Such a drum rapidly revolves under a
magnetizing head similar to the one used in a tape recorder. The elements
in such a device are magnetized and unmagnetized portions on the
surface of a drum. Ferrite cores (magnetic cores), which can change
the direction of their magnetization when a pulse of current passes
through the winding, and some other devices are also used as such
elements.

A number can be transferred from a location of the storage of a 
computer to the arithmetic unit and in the opposite direction along 
a single channel with each digit occurring serially in time sequence, 
or more than one digit is simultaneously transferred in parallel by 
means of a system of channels whose number can equal the number 
of digits in the location. Accordingly, the computers of the first 
type are referred to as being serial in storage access and those of the 
second type are said to have a parallel storage system. The latter 
method increases the speed of performing operations but at the same
time it complicates the construction of a computer. 

The greater the number of locations in the storage, the greater 
the amount of information that can be entered into the machine. 
Therefore, besides a high-speed internal memory, a computer is 
usually equipped with an external memory in order to increase 
storage capacity. An external memory is usually realized as a magne- 
tic tape from which numbers stored there can be transferred, by 
means of a special device, to the internal memory. A magnetic tape 
can carry millions of locations but the speed of recording informa- 
tion on the tape and reading it from the tape is less than that of the 
internal memory because passing the tape under the magnetizing 
head requires additional time. 

The results of calculations are converted to the decimal system 
by means of a special subroutine and then printed on paper tape 
or stored (in the external memory) to be used for further calcula- 
tions. 

Besides handling numerical data a machine can also process some 
other types of information (for instance, expressed in words) provi- 
ded it has been coded beforehand by means of binary numbers. It 
is possible to compile a program which makes a special device decode 
the results after the operation has been completed, i.e. convert the 
output data to the form which is natural for this particular type 
of information. For instance, this is the case in machine translation 
from one language into another. 

5. Instructions. A machine performs operations on numbers stored 
in the memory only according to instructions which are also entered 
into the memory before the computations are started and which are 
stored in the binary form not differing from that of numbers. There 
are one-address, two-address and three-address instructions which are 
used depending on the type of the machine. The three-address in- 
structions are especially convenient for programming, and we shall 
consider them here. 

A location storing a three-address instruction can be thought of
as being divided into four parts of specified length. For instance,
let these parts contain 3, 9, 9 and 9 binary digits, respectively (as
above, we consider locations with 30 digits). The contents of these
parts are the following:

(1) The first part termed the operation part contains the code of 
the operation and tells the machine what to do. 

(2) The second part carries the address (i.e. the number of the loca- 
tion) of the first number taking part in the operation. 

(3) The third part contains the address of the second number in- 
volved in the operation. 

(4) The fourth part stores the address of the location to which the 
result of the operation must be transferred. 

All the locations are numbered in the natural order. If the memory
contains 512 locations nine binary digits are sufficient to write the
addresses (why?). Suppose that the number 1 is the code of the
operation of addition. Then if it is necessary to add together the numbers
stored in the 417th and 73rd locations and to transfer the result
to the 646th location the corresponding instruction must have the
form

         001   100001111   000111011   110100110          (5)
parts:    1        2           3           4


The reader should verify this form taking into account the above 
division of locations carrying instructions and the fact that the 
numbers of locations are coded in the octal system (that is the above 
numbers 417, 73 and 646 are octal). After the instruction has been 
executed, the contents of all the locations of the memory (including 
the 417th and the 73rd locations) remain the same as before except 
the 646th location in which its former contents are replaced by the 
sum of the numbers contained in the locations with the addresses 
417 and 73. 
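
The verification suggested above can also be carried out mechanically. The following sketch assumes the 3 + 9 + 9 + 9 division of a 30-digit location described in the text; the helper names `encode` and `decode` are hypothetical and serve only for illustration.

```python
def encode(op_code, addr1, addr2, addr3):
    """Write a three-address instruction as four groups of binary digits."""
    if not 0 <= op_code < 2 ** 3:
        raise ValueError("the operation part has only 3 binary digits")
    for addr in (addr1, addr2, addr3):
        if not 0 <= addr < 2 ** 9:
            raise ValueError("an address has only 9 binary digits")
    groups = [format(op_code, "03b")]
    groups += [format(addr, "09b") for addr in (addr1, addr2, addr3)]
    return " ".join(groups)

def decode(word):
    """Recover the operation code and the three addresses from the groups."""
    return tuple(int(group, 2) for group in word.split())

# Addition (code 1) of the contents of the octal locations 417 and 73,
# the result being sent to the octal location 646.
word = encode(1, 0o417, 0o73, 0o646)
print(word)
```

Each octal digit of an address corresponds to three binary digits, which is why the octal notation is convenient for writing addresses by hand.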

Instruction (5) is punched on a punch tape or on a punch card 
and then read in the input unit and transferred to one of the loca- 
tions in the memory where it is stored in the same way as numbers. 
The distinction between numbers and instructions only appears 
in the process of operation of the machine because if the program 
is compiled correctly the control unit must receive as instructions 
only those signals which are sent from the locations in which the 
instructions have been placed. 

A one-address instruction is placed in a location divided into 
two parts which contain the code of an operation and the address 
of the location. To perform the above operation we need three one- 
address instructions: the first instruction sends the number from 
the 417th location into the adder, the second makes the adder sum 
up the former number and the number stored in the 73rd location 
and the execution of the last instruction results in transferring the 
outcome of the addition to the 646th location. A two-address
instruction consists of three parts which carry the code of an operation
and the addresses of the numbers on which the operation must be
performed. After the operation has been performed the result is
sent to the second address.
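
The three one-address instructions described above can be imitated as follows. This is a minimal sketch: the mnemonics `LOAD`, `ADD` and `STORE` and the function name are hypothetical (a real machine uses numerical operation codes), and the adder is modelled by a single variable.

```python
def run_one_address(program, memory):
    """Execute one-address instructions using a single adder register."""
    adder = 0
    for operation, address in program:
        if operation == "LOAD":       # send the number from the location into the adder
            adder = memory[address]
        elif operation == "ADD":      # add the number from the location to the adder
            adder += memory[address]
        elif operation == "STORE":    # transfer the contents of the adder to the location
            memory[address] = adder
        else:
            raise ValueError(f"unknown operation {operation!r}")
    return memory

# Sample contents of the octal locations 417 and 73 (assumed values).
memory = {0o417: 2.5, 0o73: 4.0, 0o646: 0.0}
run_one_address([("LOAD", 0o417), ("ADD", 0o73), ("STORE", 0o646)], memory)
print(memory[0o646])
```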

In what follows we shall consider only three-address instructions. 
For the sake of simplicity we shall write the signs of operations 
instead of their codes (for instance, the sign + will designate the 
code of the operation of addition) and decimal numbers of locations 
instead of their octal numbers. 

In most computers instructions are executed serially, in a conse- 
cutive order, unless the program given to the machine contains 
transfer instructions which will be discussed in Sec. 6. Before the
machine is started, the program (which is a certain sequence of
instructions) and initial data (that is an array of numbers on which
the corresponding operations must be performed) are read from punch
cards or punch tape and fed into the memory. Then the control
unit takes the contents of the first memory location as an
instruction and sends the corresponding signals to the arithmetic unit
which performs the operations. After that the control unit takes,
as the next instruction, the contents of the second location and so 
on with the exception of transfer instructions mentioned above. 
When a transfer instruction has been executed the control is trans- 
ferred not to the subsequent location but to the location whose address 
is contained in the instruction. Some instructions cause the machine 
to print the contents of some locations but after the realization of 
such an instruction the machine also passes to the next location. 
(By the way, printing takes more time than performing arithmetic 
operations, and therefore the speed of computations decreases when 
the program makes the machine print very often.) This step-by- 
step procedure goes on until the control unit receives the stop in- 
struction after which the machine is brought to a stop. The machine 
is also automatically stopped if the control unit takes from the pro- 
gram an instruction that cannot be executed (for instance, if a 
number of which the square root must be extracted turns out to be 
negative). The same happens if an overflow occurs, that is if
the calculations result in a number which is so large that it exceeds 
the capacity of the location and cannot be written there (see Sec. 4). 
The control panel is equipped with a special device which makes it 
possible to check the contents of any location and to insert additio- 
nal data at any moment of the machine operation process. 

Each type of automatic digital computer has its own set of in- 
structions and system of coding. In Sec. 6 we shall give several 
examples illustrating the main principles of programming. The 
reader should take into account that these examples are given only 
as an illustration and that in real computers these principles are
realized more economically. For more detail the reader is referred
to [8], [18], [20], [21], [25] and [27].

The composition of an extensive program is complicated work 
which often takes much time and requires much experience. 

6. Examples of Programming. Let it be necessary to compute the
solution of the system of equations of the first degree

$$\begin{cases} ax + by = m \\ cx + dy = n \end{cases}$$

where the numerical values of the quantities a, b, c, d, m and n are
considered to be given. According to formulas (VI.2), the sought-for
solution is expressed as

$$x = \frac{md - bn}{ad - bc}, \qquad y = \frac{an - mc}{ad - bc} \tag{6}$$

We now place the necessary instructions in locations with addresses
(numbers) 1, 2, .... The total number of the required instructions
is not yet known. Let the six input parameters a, b, c, d, m and n
be stored in locations having the numbers α + 1, α + 2, α + 3,
α + 4, α + 5 and α + 6, respectively. The value of α will be
specified later on. Several subsequent locations will be used for storing
intermediate results of the calculations. The numbers and instructions
stored in these locations before the computations are started
do not matter because when a number is written in a location its
former contents are automatically erased. The other locations will
not be used in compiling the program and calculating the result.
Let us begin with computing the first numerator in (6). To compute
md we write the instruction

(1)   ×   α+5   α+4   α+7

(here we shall write the serial number in front of each of the
instructions but it should be noted that these numbers are not in fact
punched on cards or on tape). After the instruction has been executed
the number md is placed in location (α + 7). The next instruction
is of the form

(2)   ×   α+2   α+6   α+8

and its execution results in the appearance of the number bn in
location (α + 8). Next, the number bn [stored in location (α + 8)]
must be subtracted from the number md [location (α + 7)]. Since
we no longer need these numbers the result can be written in location
(α + 7) by means of the following instruction:

(3)   −   α+7   α+8   α+7

In this example we could have used location (α + 9) for storing
the result md − bn but in more complicated programs it is often
necessary to economize locations.
Now we similarly put down the instructions which lead to the
computation of the denominator of the fractions entering into (6):

(4)   ×   α+1   α+4   α+8
(5)   ×   α+2   α+3   α+9
(6)   −   α+8   α+9   α+8

Further, the numerator stored in location (α + 7) must be divided
by the denominator stored in location (α + 8), and the result should
be printed:

(7)   :   α+7   α+8   α+7
(8)   Print   α+7

The execution of the last instruction results in printing the contents
of location (α + 7), i.e. the desired value of x, and then the machine
proceeds to execute the subsequent instruction. Taking into account
that the denominator ad − bc has already been computed and is
stored in location (α + 8) we similarly write the instructions which
result in computing y and completing the solution of the problem:


(9)    ×   α+1   α+6   α+7
(10)   ×   α+5   α+3   α+9
(11)   −   α+7   α+9   α+7
(12)   :   α+7   α+8   α+7
(13)   Print   α+7
(14)   Stop

The 14th instruction stops the machine. Thus, our program contains
14 instructions and hence we can put α = 14. Then the whole
program will occupy 20 locations and will have the form


(1)    ×   19   18   21
(2)    ×   16   20   22
(3)    −   21   22   21
(4)    ×   15   18   22
(5)    ×   16   17   23
(6)    −   22   23   22
(7)    :   21   22   21
(8)    Print   21
(9)    ×   15   20   21
(10)   ×   19   17   23
(11)   −   21   23   21
(12)   :   21   22   21
(13)   Print   21
(14)   Stop
(15)   a
(16)   b
(17)   c
(18)   d
(19)   m
(20)   n
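
The program above can be traced with a small imitation of the control unit and the arithmetic unit. This is only an illustrative sketch, not a model of any real machine: the memory is a Python dictionary, instructions are written as tuples instead of binary codes, the function name `run` is hypothetical, and the sample values of a, b, c, d, m and n are assumptions chosen so that the solution is x = 1, y = 3.

```python
def run(memory):
    """Execute three-address instructions one after another, starting at location 1."""
    printed = []
    counter = 1                          # address of the current instruction
    while True:
        op, *addresses = memory[counter]
        counter += 1                     # by default pass to the next location
        if op == "stop":
            return printed
        if op == "print":
            printed.append(memory[addresses[0]])
            continue
        n1, n2, n3 = addresses
        if op == "*":
            memory[n3] = memory[n1] * memory[n2]
        elif op == "-":
            memory[n3] = memory[n1] - memory[n2]
        elif op == ":":
            memory[n3] = memory[n1] / memory[n2]
        else:
            raise ValueError(f"unknown operation {op!r}")

program = {
    1: ("*", 19, 18, 21), 2: ("*", 16, 20, 22), 3: ("-", 21, 22, 21),
    4: ("*", 15, 18, 22), 5: ("*", 16, 17, 23), 6: ("-", 22, 23, 22),
    7: (":", 21, 22, 21), 8: ("print", 21),
    9: ("*", 15, 20, 21), 10: ("*", 19, 17, 23), 11: ("-", 21, 23, 21),
    12: (":", 21, 22, 21), 13: ("print", 21), 14: ("stop",),
}
# Locations 15-20 hold a, b, c, d, m, n: here 2x + y = 5, x + 3y = 10.
data = {15: 2.0, 16: 1.0, 17: 1.0, 18: 3.0, 19: 5.0, 20: 10.0}
memory = {**program, **data}
solution = run(memory)
print(solution)
```

Locations 21, 22 and 23 are created only when a result is first written into them, which mirrors the remark that their former contents do not matter. If ad − bc = 0 the sketch fails with a division error, just as the machine would stop on an instruction that cannot be executed.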
and the same part of a program several times. Loops are formed by
means of the conditional transfer instruction which is of the form

Transfer of control   N₁   N₂   N₃

(It is apparent that instead of the words “transfer of control” we
must in fact write the code of the operation.) There are many
variants of the realization of the instruction. For definiteness, we assume
the following interpretation of the above instruction. Let the numbers
N₁, N₂ and N₃ designate the addresses of the corresponding locations
and let (N₁)', (N₂)' and (N₃)' be the contents of the locations.
Suppose that the instruction causes the machine to compare the number
(N₁)' with the number (N₂)' and proceed to execute the subsequent
instruction if (N₁)' < (N₂)' or the instruction stored in location
N₃ if (N₁)' ≥ (N₂)', the contents of the locations remaining
unchanged in either case. In particular, the instruction

Transfer of control   1   1   N₃

causes the machine to execute the instruction stored in location N₃
after the above instruction has been read by the control unit. This
is the so-called unconditional transfer instruction.
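
The interpretation adopted above amounts to a single comparison. A minimal sketch follows; the function name `next_location` and the sample memory contents are hypothetical.

```python
def next_location(memory, current, n1, n2, n3):
    """Address of the instruction executed after 'Transfer of control n1 n2 n3'.

    `current` is the address of the transfer instruction itself.
    """
    if memory[n1] >= memory[n2]:
        return n3              # the control is transferred to location n3
    return current + 1         # otherwise the subsequent instruction is taken

memory = {1: 7, 5: 2999, 6: 2001}
print(next_location(memory, 4, 5, 6, 1))   # 2999 >= 2001: back to location 1
print(next_location(memory, 4, 6, 5, 1))   # 2001 < 2999: on to location 5
print(next_location(memory, 4, 1, 1, 9))   # equal contents: always transfers
```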

As an example, let us compose a program for printing the table
of reciprocals of 1000 successive natural numbers taken, for
instance, from 2001 to 3000. Such a program formed without the
transfer instruction would be very extensive, but it becomes in fact
rather short if the instruction is used. Let us place the number 2000
in location (α + 1) and the number 1 in location (α + 2). Besides,
let us place the number 2999 in location (α + 3) (soon we shall see
why the number is introduced). Let the first instruction have the form

(1)   +   α+1   α+2   α+1

(we again place the instructions in locations 1, 2, ...). After it
has been executed the number 2001 substitutes for the number 2000
in location (α + 1). The next two instructions result in calculating
the inverse of 2001 and in printing it:

(2)   :   α+2   α+1   α+4
(3)   Print   α+4

Now we write the instruction

(4)   Transfer of control   α+3   α+1   1

This instruction compares the number stored in location (α + 1)
with the number 2999 and transfers the control back to location 1
since the number 2001 placed in location (α + 1) is less than 2999:
(α + 1)' = 2001 < (α + 3)' = 2999. Thus the machine again
executes the first instruction which results in the appearance of the
number 2002 instead of 2001 in location (α + 1). Then the
instructions (2) and (3) are executed and thus the inverse of 2002 is
computed and printed. Next, taking the fourth instruction from the storage
and executing it the control unit causes the machine to compare
2999 with 2002 and to pass to the execution of the first instruction
again, etc. Only after the recurrent addition of unity to the contents
of location (α + 1) results in the appearance of 3000 and the inverse
of 3000 is computed and printed, the fourth instruction will
transfer the control to the next instruction since then we shall have
(α + 3)' = 2999 < (α + 1)' = 3000. Then the machine must be
stopped because all the desired results will have been printed, and
therefore the fifth instruction is

(5)   Stop

Thus, we can put α = 5, and hence the whole program will have
the following form:

(1)   +   6   7   6
(2)   :   7   6   9
(3)   Print   9
(4)   Transfer of control   8   6   1
(5)   Stop
(6)   2000
(7)   1
(8)   2999

We have used only one location 9 for storing the intermediate results 
but at the same time the contents of location 6 have been changed 
1000 times from 2000 to 3000 (with step 1) in the process of the cal- 
culations. 
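
The behaviour of the loop can be traced with a short imitation of the machine. This is a sketch under stated assumptions, not a description of a real computer: the memory is a Python dictionary, instructions are tuples rather than binary codes, the function name `run` is hypothetical, and exact fractions are used instead of the machine's binary numbers. The transfer instruction follows the interpretation adopted in the text: the control passes to N₃ when (N₁)' ≥ (N₂)'.

```python
from fractions import Fraction

def run(memory):
    """Execute the program stored from location 1 on and collect the printed numbers."""
    printed = []
    counter = 1
    while True:
        op, *addresses = memory[counter]
        counter += 1                       # by default pass to the next location
        if op == "stop":
            return printed
        if op == "print":
            printed.append(memory[addresses[0]])
            continue
        n1, n2, n3 = addresses
        if op == "transfer":
            if memory[n1] >= memory[n2]:   # otherwise fall through to the next location
                counter = n3
        elif op == "+":
            memory[n3] = memory[n1] + memory[n2]
        elif op == ":":
            memory[n3] = Fraction(memory[n1], memory[n2])
        else:
            raise ValueError(f"unknown operation {op!r}")

memory = {
    1: ("+", 6, 7, 6),
    2: (":", 7, 6, 9),
    3: ("print", 9),
    4: ("transfer", 8, 6, 1),
    5: ("stop",),
    6: 2000, 7: 1, 8: 2999,
}
table = run(memory)
print(len(table), table[0], table[-1])
```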

We now consider a variant of the above program which results 
not in printing the reciprocals but in placing them into the locations 
of the memory with the numbers ranging from 10 to 1009. These 
stored numbers can be used for some further calculations; of course, 
we suppose that the memory capacity makes it possible to place the 
numbers. This program has the following form: 


(1)   +   6   7   6
(2)   +   3   9   3
(3)   :   7   6   9
(4)   Transfer of control   8   6   1
(5)   Stop
(6)   2000
(7)   1
(8)   2999
(9)   (In this location 1 is the last binary digit, and all the other
      digits are equal to 0)

Here we have placed an auxiliary number in location 9 which has
no quantitative meaning and serves only for modifying instructions.
This is quite a new operation which is performed in the following
way in our program: after the second instruction has been executed
the first time, the third instruction is sent to the arithmetic unit
where it is transformed into

: 7 6 10
[By the way, it should be noted that the sign + in instructions (1)
and (2) designates the operations of addition which are performed
according to different rules and therefore they have different codes.
For more detail the reader is referred to the books enumerated above.]
With the third instruction taking the above form, its execution results
in placing the number 1/2001 in location 10. After the repeated
execution of the second instruction, the third instruction is modified
again and takes the form

: 7 6 11

Therefore the second execution of instruction (3) causes the control
unit to send the number 1/2002 to location 11 and so on. Thus we have
encountered here the operation of modifying an address entering
into an instruction.

Hence, it is possible to perform operations on instructions, thus
automatically modifying them in the process of work of the machine.
This obviously extends the application of digital computers.
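
Why adding the auxiliary number changes only the last address becomes clear if an instruction is regarded as one binary number. The sketch below assumes the 3 + 9 + 9 + 9 division of a location from Sec. 5; the helper names `pack` and `unpack` and the operation code 2 for division are hypothetical.

```python
ADDRESS_DIGITS = 9
ADDRESS_MASK = 2 ** ADDRESS_DIGITS - 1

def pack(op_code, n1, n2, n3):
    """Combine the four parts of an instruction into one 30-digit binary number."""
    word = op_code
    for address in (n1, n2, n3):
        word = (word << ADDRESS_DIGITS) | address
    return word

def unpack(word):
    """Split a 30-digit binary number into the operation code and three addresses."""
    n3 = word & ADDRESS_MASK
    n2 = (word >> ADDRESS_DIGITS) & ADDRESS_MASK
    n1 = (word >> 2 * ADDRESS_DIGITS) & ADDRESS_MASK
    return (word >> 3 * ADDRESS_DIGITS, n1, n2, n3)

DIVIDE = 2                    # hypothetical code of the operation of division
AUXILIARY = 1                 # the last binary digit is 1, all the others are 0

instruction = pack(DIVIDE, 7, 6, 9)
print(unpack(instruction + AUXILIARY))       # the result address becomes 10
print(unpack(instruction + 2 * AUXILIARY))   # after a second modification: 11
```

Since the last nine digits hold the third address, adding a number whose only nonzero digit is the last one increases that address by unity and leaves the other parts untouched, as long as the address does not overflow its nine digits.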

The conditional transfer instruction is used not only in employing
loops but also when it is necessary to introduce a branching
instruction which causes the computer to perform different sequences
of operations depending on some circumstances unknown beforehand.
For example, suppose that a number a must appear in a certain
location p, and let it be necessary to retain it unchanged if a ≥ 0 and
square it and store the result in the same location p if a < 0. In the
zeroth location the number 0 is usually placed, and therefore we can
realize the desired procedure by writing in the corresponding place
of the program the following instructions:

(k)     Transfer of control   p   0   k+2
(k+1)   ×   p   p   p

Branching is then performed automatically, and when the program
has been executed we shall not even know which variant has been
realized unless a special instruction causing the machine to output
the information concerning this procedure is introduced into the
program.

Finally, let us consider an example of a program in which the 
total number of operations is not set beforehand. Let it be neces- 
sary to solve the cubic equation 

$$x = 0.1x^3 + 1$$

by applying the iterative method (see Sec. V.3) and beginning with
the initial approximation $x_0 = 0$. To do this we place the number 0
in location (a + 1) (this practically means that the corresponding [...]),
the coefficient of $x^3$ in location (a + 2) and the number 1 in
location (a + 3). The successive approximations appearing in the
process of calculations will be placed in location (a + 1).

After a certain approximation has been computed the calculations 
yielding the next approximation can be carried out according to the 
following instructions: 


(1)  ×  a + 1  a + 1  a + 4
(2)  ×  a + 4  a + 1  a + 4
(3)  ×  a + 2  a + 4  a + 4
(4)  +  a + 4  a + 3  a + 4


Thus, the new approximation will be placed in location (a + 4).
It must be compared with the preceding approximation stored in
location (a + 1). If the approximations differ, the result must be
transferred to location (a + 1) and then the iteration should be
repeated. If the approximations coincide, the result must be printed
and the machine must be stopped. This can be realized by means
of the following instructions (check!):

(5)   |−|   a + 1   a + 4   a + 1

[the execution of this instruction results in sending the absolute
value of the difference between the contents of locations (a + 1)
and (a + 4) to location (a + 1)]

(6)   Transfer of control   0       a + 1   9
(7)   +                     a + 4   0       a + 1
(8)   Transfer of control   1       1       1
(9)   Print                 a + 4
(10)  Stop

Consequently, we can put a = 10 and write down the whole pro- 
gram which occupies 13 locations. The program will cause the ma- 
chine to perform the iterations until a subsequent approximation 
coincides with the preceding one (note that if the iterative process 
converges this aim is necessarily achieved because the results of 
the calculations are automatically rounded off). Then the machine 
will print the result and stop. It should be noted that if the iterative 
process does not converge either an overflow of locations will occur 
or the machine will go into a closed loop, that is repeat a certain 
sequence of instructions over and over again. In the latter case the 
machine cannot stop on its own, and we must stop it by pressing 
the stop key on the keyboard. 
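
The behaviour of this program can be tried out on a small simulated machine. The following Python sketch is only an illustration under several assumptions that are not stated in the text: the conditional transfer "A B C" is taken to pass control to instruction C when the contents of A are not less than the contents of B, every arithmetic result is rounded to a fixed number of decimal places (the parameter `digits`), and the coefficient of the cube is taken to be 0.1. The operation names `mul`, `add`, `absdiff` and `jump_if_ge` are hypothetical labels, not codes of a real machine.

```python
def run_program(coeff=0.1, digits=6, max_steps=100_000):
    """Simulate the ten-instruction iteration program on a toy three-address machine."""
    a = 10                      # base address, as chosen in the text
    mem = {
        0: 0.0,                 # the zeroth location holds the number 0
        a + 1: 0.0,             # initial approximation x_0 = 0
        a + 2: coeff,           # coefficient of x^3 (assumed value)
        a + 3: 1.0,             # the constant term 1
        a + 4: 0.0,             # working location for the new approximation
    }
    program = {
        1: ("mul", a + 1, a + 1, a + 4),
        2: ("mul", a + 4, a + 1, a + 4),
        3: ("mul", a + 2, a + 4, a + 4),
        4: ("add", a + 4, a + 3, a + 4),
        5: ("absdiff", a + 1, a + 4, a + 1),
        6: ("jump_if_ge", 0, a + 1, 9),
        7: ("add", a + 4, 0, a + 1),
        # A location compared with itself always passes the test,
        # so instruction (8) acts as an unconditional transfer.
        8: ("jump_if_ge", 1, 1, 1),
        9: ("print", a + 4, None, None),
        10: ("stop", None, None, None),
    }
    printed = []
    pc = 1                      # number of the instruction to be executed
    for _ in range(max_steps):
        op, p, q, r = program[pc]
        pc += 1
        if op == "mul":
            mem[r] = round(mem[p] * mem[q], digits)
        elif op == "add":
            mem[r] = round(mem[p] + mem[q], digits)
        elif op == "absdiff":
            mem[r] = round(abs(mem[p] - mem[q]), digits)
        elif op == "jump_if_ge":
            # Locations holding instructions are treated as 0 in comparisons.
            if mem.get(p, 0.0) >= mem.get(q, 0.0):
                pc = r
        elif op == "print":
            printed.append(mem[p])
        elif op == "stop":
            return printed
    raise RuntimeError("closed loop: the machine did not stop")


if __name__ == "__main__":
    print(run_program())
```

Because every result is rounded, two successive approximations eventually coincide exactly and the simulated machine stops; the limit on the number of steps plays the role of the stop key for a non-convergent process.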

Programs for solving more complicated problems can be very 
extensive. But they often include some simpler problems, for in- 
stance, such as the problem of computing the value of the sine for 
a given value of the argument entering into the main problem and 
so on. These simpler problems are encountered very often and it is 
therefore expedient to have special subroutines for solving them. 
These subroutines are composed beforehand and stored in certain 
locations in the external or internal memory. When a problem of 
this kind is encountered in solving a more complicated problem, a
special instruction is introduced into the program which makes 
the control unit take the corresponding subroutine out of the memory. 

To facilitate programming, several programming languages for
communicating with computers have been recently developed.
A program can be written in these languages so that all the necessary
procedures can be readily and accurately expressed in terms which
are closer to the language of mathematics than those described above.
A program written in such a language is independent of the type of
the machine we use and is printed by means of a device resembling
an ordinary typewriter. Then the program is automatically translated
into the machine code by means of a special processor. Among
these languages we mention the FORTRAN language (FORmula
TRANslating system) and the ALGOL 60 (ALGOrithmic Language
developed in 1960). The introduction of the languages in practice
will make the application of computers available for many scientists
and engineers. By the way, in some cases an experienced programmer
can compose a program in the machine code more economically
without using a computer language because he takes into
account some peculiarities of the concrete machine. This is especially
important when we have to economize machine time.

The errors occurring in a calculation process can appear because
of mistakes in composing or punching the program or due to
malfunctions of the machine itself. The latter can be systematic (for
instance, when some elements get out of order) or random (when an
element passes from one state to the other on its own, at random).
Systematic errors are checked by means of built-in checks or
supplementary programmed checks based on solving certain problems
with the answers known in advance. The correctness of composing
a program which is to be introduced into the machine is checked
before the machine is started. When checking a program we usually
make the machine execute some parts of the program and try to
roughly estimate certain intermediate results, which is very important
because these results must not exceed the capacity of locations
of the memory (see Sec. 4). To check the punching we usually punch
the program a second time and cause the machine to compare
automatically the corresponding cards from the two decks with one
another. To detect random malfunctions we can apply some well-known
arithmetical rules used for checking the results of calculations. In
more important cases the result is computed repeatedly in order
to compare the answers. Besides, computer designers try to eliminate
the possibility of random malfunctions by perfecting the machine
in the process of designing, manufacturing and modifying
its components.
Equations of Mathematical Physics


Equations of mathematical physics are mainly partial differential 
equations describing various physical processes. The theory of these 
equations is an important division of mathematics with many ap- 
plications to physics, engineering and other branches of science. 
Here we shall present some elementary facts of the theory. 

§ 1. Classical Equations of Mathematical Physics 

1. Derivation of Some Equations. Let us consider the process of
longitudinal vibrations of an elastic rectilinear homogeneous bar.
We shall draw the x-axis along the bar and denote by x the coordinate
of the corresponding point of the bar in the state of equilibrium
when the bar is unloaded. Let u = u (x, t) be the longitudinal
displacement of the point x at moment t. In investigating the
phenomenon we shall assume the hypothesis of plane sections, that is we
shall suppose that the plane sections of the bar move in the process
of vibration in such a way that they remain parallel to their initial
positions all the time (practically this means that the deviations
from this condition are inessential and can be disregarded).

The function u ( x , t) determines the law of vibrations of the bar. 
It should be noted that the term “vibrations” is understood in a 
conditional sense here because the process of deformation of the bar 
is an oscillatory motion only in certain simpler cases whereas in 
the general case the motion can be of a more complicated nature. 

In a vibration process there appear elastic stresses in the cross
sections of the bar. We shall suppose that the corresponding
deformations lie within the limits of applicability of the well-known
Hooke's law (discovered in 1660 by the English mathematician and
inventor R. Hooke, 1635-1703). The law states that unless a certain
limit is exceeded the normal stress $\sigma$ is directly proportional
to the longitudinal elongation $\varepsilon$, i.e. $\sigma = E\varepsilon$
where $E$ is Young's modulus characterizing the properties of the
material. Therefore the stress can be readily expressed in terms of the
function $u$ because the longitudinal elongation of an element $dx$ of
the bar is equal to $\dfrac{d_x u}{dx} = \dfrac{\partial u}{\partial x}$
(see Fig. 352) and thus we have

$$\sigma = E \frac{\partial u}{\partial x} \tag{1}$$


Fig. 352. The element dx in the free equilibrium state and the same element in the process of vibrations.

Now we can write down the equation of motion of the element $dx$
of the bar. Let us suppose, for generality, that the bar is subjected
to a longitudinal distributed load of intensity $p = p(x, t)$. Let $F$
denote the constant cross-section area of the bar. Then the resultant
force acting upon the element is equal to

$$[\sigma F + d_x(\sigma F)] - \sigma F + p\,dx = F \frac{\partial \sigma}{\partial x}\,dx + p\,dx = \left(FE \frac{\partial^2 u}{\partial x^2} + p\right) dx$$

Denoting the density of the material of the bar by $\rho$ we write, on
the basis of Newton's second law, the equation

$$\left(FE \frac{\partial^2 u}{\partial x^2} + p\right) dx = \rho F\,dx\,\frac{\partial^2 u}{\partial t^2}$$

Finally, dividing by $\rho F\,dx$ and introducing the notation

$$\sqrt{\frac{E}{\rho}} = a, \qquad \frac{p(x, t)}{\rho F} = f(x, t)$$

we derive the equation

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} + f(x, t) \tag{2}$$

where $f(x, t)$ is the given function and $u = u(x, t)$ is the
sought-for one.
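
To get a feeling for the magnitude of the constants entering into the equation of vibrations of the bar, we can compute a (the square root of E divided by the density) and the load divided by the density and by the cross-section area for a concrete bar. The sketch below is only an illustration: the numerical values (a steel-like Young's modulus of 2.0e11 Pa, a density of 7.8e3 kg/m^3, a cross-section area of 1 cm^2 and a load of 100 N/m) are assumptions chosen for the example and are not taken from the text, and the function name `bar_parameters` is hypothetical.

```python
import math


def bar_parameters(young_modulus, density, area, load):
    """Return (a, f): the constant a = sqrt(E / rho) and f = p / (rho * F)."""
    a = math.sqrt(young_modulus / density)   # has the dimension of a speed, m/s
    f = load / (density * area)              # has the dimension of an acceleration, m/s^2
    return a, f


if __name__ == "__main__":
    # Assumed, steel-like values in SI units.
    a, f = bar_parameters(young_modulus=2.0e11, density=7.8e3, area=1.0e-4, load=100.0)
    print(a, f)
```

The constant a has the dimension of a speed, which agrees with its role as the speed of propagation of longitudinal disturbances along the bar.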

Let us derive another important equation, namely, the heat equation
describing the process of propagation of heat in a medium.
We shall restrict ourselves to the case of a homogeneous isotropic
medium. Let us denote by $u = u(x, y, z, t)$ the temperature at
the point $(x, y, z)$ of the medium at moment $t$. In deducing the
equation which must be satisfied by the function $u$ we shall apply
Fourier's law which states that at each point of a medium the thermal
energy is transferred in the direction of $-\operatorname{grad} u$ (which means
that the heat propagates from the regions of higher temperature to
those of lower temperature) and that the speed of the propagation
is proportional to $|\operatorname{grad} u|$. This law holds with sufficient
accuracy when the variations of the temperature are comparatively
small. To put down the mathematical expression of the law we take
into account that it means that the quantity of heat passing through
a surface element $(dS)$ during the time period $dt$ is equal to

$$dQ = k \operatorname{grad}_n u \, dS \, dt$$

where $k$ is the coefficient of heat conduction and $n$ is the outer unit
normal vector to the area. Then the total quantity of heat passing
during time $dt$ into the interior of a solid $(\Omega)$ with boundary surface
$(S)$ is equal to

$$k \oint_{(S)} \operatorname{grad}_n u \, dS \, dt = k \oint_{(S)} \operatorname{grad} u \cdot d\mathbf{S} \, dt$$

(compare this with Sec. XVI.22). Applying Ostrogradsky's formula
(see Sec. XVI.23) and considering the volume $(\Omega)$ to be small we
obtain (compare with Sec. XVII.22), to within infinitesimals of
higher order, the relation

$$dQ = k \int_{(\Omega)} \operatorname{div} \operatorname{grad} u \, d\Omega \, dt = k \int_{(\Omega)} \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) d\Omega \, dt = k \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) \Omega \, dt$$

Further we shall write $(d\Omega)$ instead of $(\Omega)$.

Now we proceed to write the equation of heat balance for the
element of volume $(d\Omega)$. Let us suppose, for generality, that there
are sources of thermal energy in the medium distributed with density
$q = q(x, y, z, t)$. The quantity of heat in the volume $(d\Omega)$ is
equal to $c\rho u \, d\Omega$ where $c$ is the specific heat and $\rho$ is the
density of the substance. It follows that

$$d_t(c\rho u \, d\Omega) = k \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) d\Omega \, dt + q \, d\Omega \, dt$$

Cancelling out $c\rho \, d\Omega \, dt$ on both sides and introducing the notation
$\dfrac{k}{c\rho} = a$ ($a$ is the coefficient of temperature conductivity) and
$\dfrac{q}{c\rho} = f$ we arrive at the heat equation

$$\frac{\partial u}{\partial t} = a \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) + f(x, y, z, t) \tag{3}$$

where the function $f(x, y, z, t)$ is given and $u = u(x, y, z, t)$ is
the sought-for function.

If there are no sources of thermal energy in the part of space under
consideration equation (3) takes the simplified form

$$\frac{\partial u}{\partial t} = a \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) \tag{4}$$


2. Some Other Equations. It turns out that various physical pro- 
cesses are described by means of similar differential equations. 

For instance, equation (2) describes not only the process of
longitudinal vibrations of a bar but also the process of small transverse
oscillations of a taut string (in this case u denotes the transverse
deflection; see Sec. XVII.31), the process of longitudinal oscillations
of a gas in a tube (in this case u is the pressure or the density),
the oscillations of an electric current in a wire with distributed
resistance and inductance when there are no energy losses etc.

The equation

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} \tag{5}$$

is of particular importance. It describes free one-dimensional
oscillations of a homogeneous medium, whereas equation (2) describes
forced oscillations. In the case of a non-homogeneous medium the
equation becomes more complicated.

These equations should be accordingly changed in the case of
two-dimensional and three-dimensional vibration processes. For instance,
the equation of vibrations of a homogeneous isotropic
three-dimensional medium is of the form

$$\frac{\partial^2 u}{\partial t^2} = a^2 \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) + f(x, y, z, t) \tag{6}$$

in the case of forced oscillations and of the form

$$\frac{\partial^2 u}{\partial t^2} = a^2 \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right)$$

in the case of free oscillations. These equations are also satisfied
by the projections of the vectors of electric and magnetic field
intensities, by the projections of the displacement vector of elastic
vibrations of a body and so on.

Equations (3) and (4) are of another type. We have seen that 
these equations are satisfied by the temperature in the process of 
heat transfer in an isotropic homogeneous medium. It can be shown 
that the same equations describe the diffusion processes if u denotes 
the density of a diffusing substance. Equations (3) and (4) can also 
be considered in a plane or on a straight line. For instance, in the
latter case the equations take, respectively, the forms

$$\frac{\partial u}{\partial t} = a \frac{\partial^2 u}{\partial x^2} + f(x, t) \tag{7}$$

and

$$\frac{\partial u}{\partial t} = a \frac{\partial^2 u}{\partial x^2} \tag{8}$$


If the process in question is stationary both the sought-for function
$u$ and the function $f$ describing an external action must be
independent of $t$. Then from (6) we obtain the equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = -\frac{1}{a^2} f(x, y, z) = f_1(x, y, z) \tag{9}$$

which is referred to as Poisson's equation. When, in a part of space,
there are no external actions, we obtain the equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0 \tag{10}$$

called Laplace's equation. Equations (3) and (4) yield the same
equations (9) and (10) in the stationary case.

Here we have put down only those equations of mathematical 
physics which are most thoroughly studied. There are many other 
equations describing various phenomena. 

3. Initial and Boundary Conditions. We showed in Sec. XV.2 
that every ordinary differential equation possesses infinitely many 
solutions and that to isolate a concrete solution we need some sub- 
sidiary conditions. The same is true for partial differential equations 
for which we usually set initial or boundary conditions or both as 
the subsidiary conditions. 

We now return to the problem of longitudinal vibrations of a bar
(see Sec. 1). It seems natural that to obtain a concrete process of
vibrations we must set the initial conditions specifying the initial
state of the bar. From the point of view of physics the initial state
is completely determined by the corresponding initial displacements
and initial velocities of the points of the bar [compare with
conditions (XV.10)]. Thus, for equation (2), we write the initial conditions

$$u|_{t=0} = \varphi(x) \quad \text{and} \quad \left.\frac{\partial u}{\partial t}\right|_{t=0} = \psi(x) \tag{11}$$

where the functions $\varphi$ and $\psi$ are given [compare with formulas
(XVII.133)].

If we investigate a portion of the bar which is placed so far from
its ends that their influence is inessential during the time period
in question (in other words, if it is possible to regard the bar as
being infinite) it is sufficient to know only initial conditions (11)
in order to determine the solution. In this case we arrive at a
problem with initial conditions alone, i.e. Cauchy's problem. The
initial conditions and Cauchy's problem are of a similar type for
equation (6) in a plane or in space [equation (6) is referred to as the wave
equation]. The initial conditions for heat equation (3) or (7) differ
from those of (6) because the physical meaning of the problem
indicates that to specify uniquely a process of heat transfer it is
sufficient to specify only the initial distribution of temperature and
therefore it is only the initial values of u that should be set for t = 0.

If the influence of the ends of the bar cannot be neglected then
besides the initial conditions we must set certain boundary conditions
which describe the processes on the boundary of the medium in
question, i.e. at the ends of the bar in our case. Boundary conditions
can be of different forms depending on the processes, and they should
be set for both ends (x = 0 and x = l) independently. For instance,
let us consider the left end.

It can be rigidly fixed and then we have

$$u|_{x=0} = 0 \quad (\text{i.e. } u(0, t) = 0) \tag{12}$$

A more general condition can describe the motion of the left end
according to a given law of the form

$$u|_{x=0} = \chi(t) \tag{13}$$

where the function $\chi(t)$ is given. Condition (13), and its special
case (12), is called a condition of the first kind.

If the left end is free the condition must express the fact that there
is no normal stress there. Thus, by (1), we have

$$\left.\frac{\partial u}{\partial x}\right|_{x=0} = 0$$

This condition [and also a more general condition
$\left.\dfrac{\partial u}{\partial x}\right|_{x=0} = \chi(t)$]
is called a condition of the second kind.

In the case of an elastic fixing of the left end the normal stress
in the end point section is proportional to its displacement, that
is we have $\sigma = ku$ at the end-point. From this, on the basis of
formula (1), we deduce the condition

$$\left.\left(\frac{\partial u}{\partial x} - \frac{k}{E} u\right)\right|_{x=0} = \left.\left(\frac{\partial u}{\partial x} - \alpha u\right)\right|_{x=0} = 0 \qquad \left(\alpha = \frac{k}{E}\right) \tag{14}$$

which is referred to as a condition of the third kind; the same term
is applied to the corresponding non-homogeneous condition. (Let
the reader verify that in the case of the right end the similar
condition contains + instead of − in front of $\alpha$.)

Thus, if, for instance, the left end of the bar is rigidly fixed and
the right end is free the boundary conditions are of the form

$$u|_{x=0} = 0, \qquad \left.\frac{\partial u}{\partial x}\right|_{x=l} = 0$$


The problem of solving a partial differential equation when both 
initial and boundary conditions are given is referred to as a mixed 
problem (boundary-initial-value problem). 

If an equation is solved in a three- or two-dimensional domain
a boundary condition of the first kind reduces to specifying the 
values of the sought-for function on the boundary of the domain 
and a condition of the second kind consists in setting the values 
of the derivative of the function along the normal to the boundary. 

Thus, if the influence of the boundary of the domain is essential
the equation of the non-stationary process in question must be solved
for given initial and boundary conditions. It is apparent that in
the case of the equation of a stationary process [for instance, equation
(9) or (10)] we set only boundary conditions. In the latter case, for
a condition of the first kind, the problem is called the Dirichlet
problem or the first boundary value problem (after the German
mathematician P. Dirichlet, 1805-1859). Accordingly, for a condition
of the second kind it is called the Neumann problem or the second
boundary value problem after the German mathematician K. Neumann
(1832-1925) who for the first time systematically investigated
the problem in 1877.

§ 2. Method of Separation of Variables 

There are many methods of solving equations of mathematical 
physics. Here we shall discuss one of the most important methods 
referred to as the method of separation of variables. 

4. Basic Example. Let us consider the problem of solving the
equation

$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} \qquad (0 \le x \le l,\ 0 \le t < \infty) \tag{15}$$

for the simplest boundary conditions

$$u|_{x=0} = 0, \quad u|_{x=l} = 0 \qquad (0 \le t < \infty) \tag{16}$$

and initial conditions

$$u|_{t=0} = \varphi(x), \quad \left.\frac{\partial u}{\partial t}\right|_{t=0} = \psi(x) \tag{17}$$

where $\varphi$ and $\psi$ are some given functions.*

We shall interpret the function $u = u(x, t)$ as the transverse
deflection of a taut vibrating unloaded string (see Sec. XVII.31)
fixed at the ends (x = 0 and x = l) and the functions $\varphi$ and $\psi$ in
conditions (17) as an initial deflection and an initial velocity. We
can, of course, interpret $u(x, t)$ as a longitudinal displacement
of the point x of a bar and the like.

* In Sec. XVII.31 we solved this problem [see equation (121) and
conditions (132) and (133)] by means of expansions in Fourier series. The method
of separation of variables, as we shall see, reduces to Fourier expansions and
yields the same result [see formulas (25) and (XVII.135)]. (Tr.)
The main idea of the method of separation of variables is that
we seek a solution of equation (15) of the special form

$$u(x, t) = X(x)\,T(t) \tag{18}$$

under boundary conditions (16) (but without any initial conditions).
Hence, we are interested in a solution which is the product of a
function dependent only on x by a function dependent only on t.
If we want to investigate the form of the string corresponding to
solution (18) at the subsequent moments $t_1, t_2, t_3, \ldots$ we must
multiply the fixed function $X(x)$ by the constant factors $T(t_1)$,
$T(t_2)$, $T(t_3)$, .... Therefore the zeros of the function $X(x)$
remain all the time the zeros of the function $u(x, t)$ (and are
referred to as the nodes). Accordingly, the points of extrema
of the function $X(x)$ remain the points of extrema of $u(x, t)$
(these are called the antinodes). The form of the vibrating string at
the successive moments of time is depicted in Fig. 353. A vibrational
state of peculiar type (18) is called a standing wave. The function
$X(x)$ describes the form of the standing wave and $T(t)$
expresses the law of its variation in time. Hence, we have posed
the problem of finding the standing waves which are possible for
the given boundary conditions.

The substitution of (18) into (15) results in

$$X(x)\,T''(t) = a^2 X''(x)\,T(t), \quad \text{i.e.} \quad \frac{T''(t)}{a^2 T(t)} = \frac{X''(x)}{X(x)}$$

Fig. 353

The left-hand side being independent of x and the right-hand side
being independent of t, the last equality can be fulfilled if and only
if both sides contain neither x nor t. Hence, both sides are equal to
the same constant which we denote by $-\lambda$ (but $\lambda$ itself can be of
an arbitrary sign):

$$\frac{T''(t)}{a^2 T(t)} = \frac{X''(x)}{X(x)} = -\lambda$$

It follows that $T(t)$ and $X(x)$ must satisfy the corresponding
ordinary differential equations

$$T''(t) + \lambda a^2 T(t) = 0 \tag{19}$$

and

$$X''(x) + \lambda X(x) = 0 \tag{20}$$

Thus, the independent variables have been separated!



Now recall that we have imposed conditions (16) on solution (18).
The first condition (16) yields $X(0)\,T(t) = 0$ which implies $X(0) = 0$
[if $X(0) \ne 0$ we must have $T(t) \equiv 0$ and then $u(x, t) \equiv 0$
which does not yield a standing wave]. We similarly consider the
second condition (16) and thus arrive at the boundary conditions
for a standing wave:

$$X(0) = 0, \quad X(l) = 0 \tag{21}$$

Consequently, to find all the possible forms of standing waves we
must solve equation (20) with boundary conditions (21) (a similar
problem was treated in Sec. XV.16 but here we shall not use the
results obtained there). It is clear that the function $X \equiv 0$ satisfies
both equation (20) and conditions (21) for any $\lambda$ but we are not
interested in this function since it does not yield a standing wave.
Thus, the only solutions we are interested in are those satisfying
the condition $X(x) \not\equiv 0$. Such solutions, generally speaking, may
not exist for all $\lambda$. The values of $\lambda$ for which these solutions exist
are called the eigenvalues of problem (20) with conditions (21).
The solutions $X(x)$ are called the eigenfunctions of the problem
corresponding to these eigenvalues.

We first suppose that $\lambda < 0$, i.e. $\lambda = -\nu^2$. Then equation (20)
has the general solution

$$X(x) = C_1 e^{\nu x} + C_2 e^{-\nu x}$$

(check it up!), and conditions (21) imply that

$$C_1 + C_2 = 0 \quad \text{and} \quad C_1 e^{\nu l} + C_2 e^{-\nu l} = 0$$

It follows that $C_2 = -C_1$ and therefore

$$C_1 e^{\nu l} - C_1 e^{-\nu l} = 0, \quad \text{i.e.} \quad C_1 (e^{2\nu l} - 1)\, e^{-\nu l} = 0$$

But the second and the third factors are different from zero (why?).
Hence $C_1 = 0$ and consequently $C_2 = 0$ and $X(x) \equiv 0$. Thus,
there are no negative eigenvalues of this problem. Let the reader
prove that the value $\lambda = 0$ is not an eigenvalue either.

Now let $\lambda = k^2 > 0$. Then equation (20) has the general solution

$$X(x) = C_1 \cos kx + C_2 \sin kx \tag{22}$$

(check it up!). Conditions (21) imply that

$$C_1 = 0 \quad \text{and} \quad C_2 \sin kl = 0 \tag{23}$$

But we must have $C_2 \ne 0$ (why?) and hence $\sin kl = 0$. From this
we deduce

$$kl = n\pi \quad \text{and} \quad k = \frac{n\pi}{l} \qquad (n = 1, 2, 3, \ldots)$$




Thus, the problem has the following infinite set (spectrum) of
eigenvalues:

$$\lambda = \lambda_n = \left(\frac{n\pi}{l}\right)^2 \qquad (n = 1, 2, 3, \ldots)$$

The corresponding eigenfunctions are implied by (22):

$$X_n(x) = \sin \frac{n\pi x}{l} \qquad (n = 1, 2, \ldots) \tag{24}$$

We have not put down the factor $C_2$ here because any constant
factor can be included into $T(t)$ in the expression (18).

Substituting the value $\lambda = \lambda_n$ (n = 1, 2, 3, ...) thus
found into equation (19) we get

$$T(t) = A \cos \frac{an\pi}{l} t + B \sin \frac{an\pi}{l} t$$

where A and B are arbitrary constants. Thus, we have obtained
harmonic vibrations with frequency $\omega_n = \dfrac{an\pi}{l}$. Consequently,
according to (18), the sought-for standing waves are of the form

$$u(x, t) = \left(A \cos \frac{an\pi}{l} t + B \sin \frac{an\pi}{l} t\right) \sin \frac{n\pi x}{l} \qquad (n = 1, 2, \ldots)$$

(The first three standing waves corresponding to n = 1, 2, 3 are
shown in Fig. 354.) The frequencies of vibrations of these waves
are equal to

$$\omega_1 = \frac{a\pi}{l}, \quad \omega_2 = \frac{2a\pi}{l} = 2\omega_1, \quad \omega_3 = \frac{3a\pi}{l} = 3\omega_1, \quad \ldots$$

As it is said in acoustics, the first standing wave corresponds to
the fundamental tone and the subsequent standing waves whose
frequencies are 2, 3, 4 etc. times the frequency of the fundamental
tone determine the overtones (see Sec. XVII.23).

Let us now construct the general solution of equation (15) with
boundary conditions (16) which must describe the general form
of vibrations that are possible under the given boundary conditions.
This general solution is obtained as a combination (superposition)
of the above standing waves with different amplitudes, i.e. it has
the form

$$u(x, t) = \sum_{n=1}^{\infty} \left(A_n \cos \frac{an\pi}{l} t + B_n \sin \frac{an\pi}{l} t\right) \sin \frac{n\pi x}{l} \tag{25}$$

where $A_n$ and $B_n$ (n = 1, 2, 3, ...) are arbitrary constants.
Equation (15) and boundary conditions (16) being linear and homogeneous,
the whole sum (25) satisfies them because each summand satisfies
the equation and the conditions (compare with property 1 in
Sec. XV.14). To prove that formula (25) represents the general
solution we must show that the arbitrary constants $A_n$ and $B_n$ can be
so chosen that any initial conditions (17) should be satisfied. Note
that in contrast to an ordinary differential equation whose general
solution contains a finite number of arbitrary constants the general
solution of problem (15), (16) depends on two infinite sequences of
arbitrary constants or, which is the same, on two arbitrary functions,
namely, the functions $\varphi(x)$ and $\psi(x)$ entering into conditions (17).

To satisfy the first condition (17) we put t = 0 in (25) which yields

$$\varphi(x) = \sum_{n=1}^{\infty} A_n \sin \frac{n\pi x}{l} \qquad (0 \le x \le l) \tag{26}$$

Thus, the given function $\varphi(x)$ must be expanded into a series in
eigenfunctions of problem (20), (21). In this particular case this is
nothing but an expansion into a Fourier series of form (XVII.107).
As it was shown in Sec. XVII.22, expansion (26) is possible, and
the coefficients of the expansion are found on the basis of the
orthogonality condition:

$$A_n = \frac{\int_0^l \varphi(x) \sin \dfrac{n\pi x}{l}\,dx}{\int_0^l \sin^2 \dfrac{n\pi x}{l}\,dx} = \frac{2}{l} \int_0^l \varphi(x) \sin \frac{n\pi x}{l}\,dx \tag{27}$$

To satisfy the second condition (17) we differentiate both sides of
equality (25) with respect to t and then put t = 0:

$$\psi(x) = \sum_{n=1}^{\infty} B_n \frac{an\pi}{l} \sin \frac{n\pi x}{l}$$

After the manner of (27) we obtain the relation

$$B_n = \frac{2}{an\pi} \int_0^l \psi(x) \sin \frac{n\pi x}{l}\,dx \tag{28}$$

(check up the calculations!).

Thus, the solution of the original problem (15)-(17) is given by
formula (25) in which the coefficients are defined by formulas (27)
and (28).
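
The solution of the string problem by a series of standing waves with coefficients found from the initial deflection and the initial velocity can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the series is truncated after a finite number of terms (`terms`), the integrals are approximated by the midpoint rule, and the initial deflection x(l − x) with zero initial velocity is a sample choice that is not taken from the text. The names `sine_coefficient` and `string_solution` are hypothetical.

```python
import math


def sine_coefficient(g, n, l, steps=2000):
    """Approximate (2/l) * integral over [0, l] of g(x) sin(n pi x / l) by the midpoint rule."""
    h = l / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += g(x) * math.sin(n * math.pi * x / l)
    return 2.0 / l * total * h


def string_solution(phi, psi, l, a, terms=50):
    """Return u(x, t) as a truncated series of standing waves."""
    # A_n is the sine coefficient of the initial deflection phi.
    A = [sine_coefficient(phi, n, l) for n in range(1, terms + 1)]
    # B_n is the sine coefficient of the initial velocity psi divided by a n pi / l.
    B = [l / (a * n * math.pi) * sine_coefficient(psi, n, l)
         for n in range(1, terms + 1)]

    def u(x, t):
        total = 0.0
        for n in range(1, terms + 1):
            omega = a * n * math.pi / l          # frequency of the n-th standing wave
            time_factor = A[n - 1] * math.cos(omega * t) + B[n - 1] * math.sin(omega * t)
            total += time_factor * math.sin(n * math.pi * x / l)
        return total

    return u


if __name__ == "__main__":
    u = string_solution(lambda x: x * (1.0 - x), lambda x: 0.0, l=1.0, a=1.0)
    print(u(0.3, 0.0), u(0.3, 1.0), u(0.3, 2.0))
```

Since all the frequencies are integral multiples of the fundamental one, the computed motion repeats itself after the time 2l/a, which gives a simple check of the program.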
5. Some Other Problems.

1. Suppose that instead of equation (15) we consider homogeneous
heat equation (8) with the same simplest boundary conditions (16)
and with initial conditions of the form

$$u|_{t=0} = \varphi(x) \qquad (0 \le x \le l)$$

(see Sec. 3). After substitution (18) has been performed and the
variables have been separated we arrive at the same problem (20),
(21) whose solution yields the eigenfunctions and the eigenvalues
(let the reader perform the calculations!). But instead of (19) we
obtain the equation

$$T'(t) + \lambda a T(t) = 0$$

for the function $T(t)$. It follows that

$$T(t) = A e^{-\lambda a t}$$

Therefore we obtain the formula

$$u(x, t) = \sum_{n=1}^{\infty} A_n e^{-\frac{a n^2 \pi^2}{l^2} t} \sin \frac{n\pi x}{l}$$

which expresses the general solution of the equation under the given
boundary conditions. This formula substitutes for expression (25)
in this case. The coefficients $A_n$ can be found here by means of the
same formula (27).

We see that the set of the eigenfunctions and eigenvalues remains
the same as before in this problem but the dependence on time is
of a different type here. Instead of harmonic functions of t which
we had in Sec. 4 we obtain exponentially damping functions in this
case (which results in $\lim_{t \to \infty} u = 0$). By the way, the physical
meaning of the problem in question implies that the function $u(x, t)$
must behave in this peculiar way as t increases (why?).
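
The damping of the temperature can be observed in a short computation. The sketch below is only an illustration under stated assumptions: the series is truncated after a finite number of terms, the coefficients are approximated by the midpoint rule, and the initial temperature x(l − x) is a sample choice that is not taken from the text. The names `sine_coefficient` and `heat_solution` are hypothetical.

```python
import math


def sine_coefficient(g, n, l, steps=2000):
    """Approximate (2/l) * integral over [0, l] of g(x) sin(n pi x / l) by the midpoint rule."""
    h = l / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += g(x) * math.sin(n * math.pi * x / l)
    return 2.0 / l * total * h


def heat_solution(phi, l, a, terms=50):
    """Return u(x, t) for the heat equation with zero temperature at both ends."""
    A = [sine_coefficient(phi, n, l) for n in range(1, terms + 1)]

    def u(x, t):
        total = 0.0
        for n in range(1, terms + 1):
            lam = (n * math.pi / l) ** 2          # n-th eigenvalue
            total += A[n - 1] * math.exp(-lam * a * t) * math.sin(n * math.pi * x / l)
        return total

    return u


if __name__ == "__main__":
    u = heat_solution(lambda x: x * (1.0 - x), l=1.0, a=1.0)
    for t in (0.0, 0.1, 0.5, 1.0):
        print(t, u(0.5, t))
```

For large t only the first term of the series is noticeable, since the higher terms are damped much faster.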

Now take the Laplace equation

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \qquad (0 \le x \le l,\ 0 \le y \le m)$$

with the conditions

$$u|_{x=0} = 0, \quad u|_{x=l} = 0 \qquad (0 \le y \le m)$$

Let the reader verify that the same method yields the general
solution of the form

$$u(x, y) = \sum_{n=1}^{\infty} \left(A_n e^{\frac{n\pi y}{l}} + B_n e^{-\frac{n\pi y}{l}}\right) \sin \frac{n\pi x}{l}$$

for this problem. The constants $A_n$ and $B_n$ can be specified if some
subsidiary boundary conditions for y = 0 and y = m are given
(then we substitute y = 0 and y = m into the above solution etc.).

It should be noted that such a separation of variables is by far
not always possible for all the partial differential equations.

2. We now turn back to equation (15). Suppose that instead of 
boundary conditions (16) we have conditions of another type. For 
instance, let us take the conditions 

u |x=o — 0, -^1^ = 0 

This results in a change of conditions (21) specifying the eigenfunc- 
tions. The new conditions will be of the form 


X (0) = 0, X' (Z) = 0 
Then instead of (23) we arrive at the relations 
C i = 0, CJc cos kl = 0 

which imply 

kl= — (n—l, 2, ...) 


Therefore the eigenvalues and the eigenfunctions of this problem 
are expressed by the formulas 


K = -jr ( — T+ nn Y' X n(x) = sin [(— x+ WIX ) t] 

(n = l, 2, 3, ...) 


Hence, the spectrum of eigenvalues (together with the spectrum of 
frequencies) and the set of eigenfunctions have changed. The first 
four eigenfunctions are depicted in Fig. 355. It is easy to directly 
verify that the eigenfunctions are orthogonal to one another. It 
can be proved that the eigenfunctions of the problems of this kind 
always form a complete system of functions. Therefore the solution 
of such problems can be completed by means of techniques similar 
to those given in Sec. 4. 
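The orthogonality of these eigenfunctions is easy to check numerically. The sketch below is an illustration only: it assumes the eigenfunctions $\sin[(n\pi - \pi/2)x/l]$ that follow from the conditions $X(0) = 0$, $X'(l) = 0$, uses a hand-written Simpson rule, and takes an arbitrary sample length `l`.

```python
import math

def eigenfunction(n, x, l):
    """X_n(x) = sin((n*pi - pi/2) * x / l), which satisfies X(0) = 0 and X'(l) = 0."""
    return math.sin((n * math.pi - math.pi / 2) * x / l)

def simpson(f, a, b, m=2000):
    """Composite Simpson's rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inner_product(n, k, l):
    """Integral of X_n(x) * X_k(x) over [0, l]."""
    return simpson(lambda x: eigenfunction(n, x, l) * eigenfunction(k, x, l), 0.0, l)

l = 1.5  # sample length, chosen arbitrarily
print("(X_1, X_2) =", inner_product(1, 2, l))  # close to zero
print("(X_3, X_3) =", inner_product(3, 3, l))  # close to l / 2
```

The value $l/2$ for the squared norm is what is needed to normalize the coefficients when a function is expanded in these eigenfunctions.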

We often deal with more complicated equations when finding 
eigenfunctions of a problem. For instance, take the boundary con- 
ditions of the form 


$$u|_{x=0} = 0, \quad \left.\left(\frac{\partial u}{\partial x} + \alpha u\right)\right|_{x=l} = 0$$

[see formula (14)]. Then instead of (23) we obtain

$$C_1 = 0, \quad C_2 (k \cos kl + \alpha \sin kl) = 0$$

which implies $\tan kl = -\frac{k}{\alpha}$. Let us denote $kl$ by $p$. The equation
for determining $p$ is of the form $\tan p = -\frac{p}{\alpha l}$. A method of graphical
solution of the equation is illustrated in Fig. 356. The graphs
clearly indicate that there is an infinite sequence of positive solutions
$p_1 < p_2 < p_3 < \ldots$ This implies

$$\lambda_n = \frac{p_n^2}{l^2}, \quad X_n(x) = \sin\frac{p_n x}{l} \qquad (n = 1, 2, \ldots)$$

In this case the eigenfunctions also form a complete orthogonal 
system of functions. 
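Besides the graphical method, the roots of the transcendental equation can be found numerically. The sketch below is an illustration under stated assumptions: it takes $\alpha l = c > 0$, rewrites $\tan p = -p/c$ as $p \cos p + c \sin p = 0$ (which has the same positive roots, since $\cos p \neq 0$ at a root), and uses the fact that for $c > 0$ the $n$-th root lies between $(n - \tfrac12)\pi$ and $n\pi$. The function names are hypothetical.

```python
import math

def g(p, c):
    """Left-hand side of p*cos(p) + c*sin(p) = 0, equivalent to tan(p) = -p/c."""
    return p * math.cos(p) + c * math.sin(p)

def root(n, c, tol=1e-12):
    """n-th positive root by bisection on ((n - 1/2)*pi, n*pi); assumes c > 0."""
    lo, hi = (n - 0.5) * math.pi, n * math.pi
    g_lo = g(lo, c)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid, c)
        if g_lo * g_mid <= 0.0:
            hi = mid              # the sign change is in the left half
        else:
            lo, g_lo = mid, g_mid  # the sign change is in the right half
    return 0.5 * (lo + hi)

c = 1.0  # sample value of alpha * l, chosen arbitrarily
roots = [root(n, c) for n in range(1, 4)]
print(roots)
```

Dividing each root by $l$ gives the corresponding value of $k$; for large $n$ the roots approach $(n - \tfrac12)\pi$, as the graphs suggest.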

3. Let us consider equation (2) describing the process of forced 
oscillations of a string under the simplest boundary conditions (16) 
and initial conditions (17). (By the way, the problem can be treated 
similarly in the case of boundary conditions of other types.) Here
the first stage is to determine the set of the eigenfunctions of the corresponding
homogeneous problem. This was performed in Sec. 4
and we can therefore apply the results obtained there. After that
we expand the functions $u(x, t)$ and $f(x, t)$ into series in the eigenfunctions
for any fixed value of $t$ and obtain

$$u(x, t) = \sum_{n=1}^{\infty} T_n(t) \sin\frac{n\pi x}{l} \tag{29}$$

and

$$f(x, t) = \sum_{n=1}^{\infty} H_n(t) \sin\frac{n\pi x}{l} \tag{30}$$

The coefficients $H_n(t)$ are immediately found as the Fourier coefficients
of the given function $f(x, t)$:

$$H_n(t) = \frac{2}{l} \int_0^l f(x, t) \sin\frac{n\pi x}{l}\,dx$$

But the coefficients $T_n(t)$ are unknown here; they are the sought-for
quantities of our problem.
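Substituting (29) and (30) into equation (2) we obtain

$$\sum_{n=1}^{\infty} T_n''(t) \sin\frac{n\pi x}{l} = -a^2 \sum_{n=1}^{\infty} \left(\frac{n\pi}{l}\right)^2 T_n(t) \sin\frac{n\pi x}{l} + \sum_{n=1}^{\infty} H_n(t) \sin\frac{n\pi x}{l}$$

Now, equating the coefficients in similar eigenfunctions we find

$$T_n'' + \left(\frac{n\pi a}{l}\right)^2 T_n = H_n(t) \qquad (0 \le t < \infty) \tag{31}$$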

Besides, if we substitute $t = 0$ into (29) then, according to initial
conditions (17), its left-hand side must be equal to $\varphi(x)$. Consequently,
the values $T_n(0)$ of the functions $T_n(t)$ are equal to the coefficients
of the expansion of the given function $\varphi(x)$ into a series in
the eigenfunctions. This enables us to easily find these values:

$$T_n(0) = \frac{2}{l} \int_0^l \varphi(x) \sin\frac{n\pi x}{l}\,dx \tag{32}$$

Similarly, differentiating equality (29) with respect to $t$ and substituting
$t = 0$ after that, we obtain

$$T_n'(0) = \frac{2}{l} \int_0^l \psi(x) \sin\frac{n\pi x}{l}\,dx \tag{33}$$

Thus, to find T n (t) we must take initial conditions (32), (33) and 
solve the ordinary non-homogeneous linear differential equation 
with constant coefficients (31). The solution of the equation can be 
easily found by means of the method of variation of arbitrary con- 
stants [see Sec. XV.15 and, in particular, the solution of equation 
(XV.87)]. Substituting the coefficients thus found into (29) we ob- 
tain the sought-for solution. 
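For a single mode the computation can be sketched as follows. This is an illustration, not the book's own formula: it uses the standard closed form to which variation of constants leads for $T'' + \omega^2 T = H(t)$, namely $T(t) = T(0)\cos\omega t + \frac{T'(0)}{\omega}\sin\omega t + \frac{1}{\omega}\int_0^t H(\tau)\sin\omega(t - \tau)\,d\tau$ with $\omega = n\pi a/l$, and evaluates the integral by Simpson's rule. All names and sample values are assumptions.

```python
import math

def simpson(f, a, b, m=400):
    """Composite Simpson's rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def mode_amplitude(t, omega, T0, T1, H):
    """Solution of T'' + omega^2 T = H(t) with T(0) = T0 and T'(0) = T1."""
    free = T0 * math.cos(omega * t) + (T1 / omega) * math.sin(omega * t)
    forced = simpson(lambda tau: H(tau) * math.sin(omega * (t - tau)), 0.0, t) / omega
    return free + forced

# Sample data (assumptions): mode n = 2 on a string with a = 1, l = 1,
# a constant forcing coefficient H(t) = 1 and arbitrary initial values.
n, a, l = 2, 1.0, 1.0
omega = n * math.pi * a / l
T0, T1, t = 0.3, -0.5, 0.8
value = mode_amplitude(t, omega, T0, T1, lambda tau: 1.0)

# For H = 1 the integral can be taken by hand, which gives a reference value.
exact = (T0 * math.cos(omega * t) + (T1 / omega) * math.sin(omega * t)
         + (1.0 - math.cos(omega * t)) / omega**2)
print(value, exact)
```

In a full computation $T_0$, $T_1$ and $H$ would come from formulas (32), (33) and the Fourier coefficients of $f$, and the resulting amplitudes would be summed over $n$ as in (29).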

4. Let us now take a problem in which not only the equation but 
also the boundary conditions are non-homogeneous. For instance, 
let the problem be of the form 

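$$\frac{\partial^2 u}{\partial t^2} = a^2 \frac{\partial^2 u}{\partial x^2} + f(x, t) \qquad (0 \le x \le l,\ 0 \le t < \infty) \tag{34}$$

$$u|_{x=0} = \chi_1(t), \quad u|_{x=l} = \chi_2(t) \qquad (0 \le t < \infty) \tag{35}$$

$$u|_{t=0} = \varphi(x), \quad \left.\frac{\partial u}{\partial t}\right|_{t=0} = \psi(x) \qquad (0 \le x \le l) \tag{36}$$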

This problem can be reduced to a problem of the type considered 
above. For this purpose we take an arbitrary function g (x, t) satis- 
fying the boundary conditions, i.e. conditions (35). For example, 
we can put $g(x, t) = \chi_1(t) + \frac{x}{l}[\chi_2(t) - \chi_1(t)]$ or choose $g(x, t)$
in any other way. Then we replace the sought-for function by means
of the formula

$$u(x, t) = g(x, t) + U(x, t) \tag{37}$$

where U ( x , t) is the new unknown function. To derive the diffe- 
rential equation and the corresponding subsidiary conditions for 
U (x, t ) we must substitute expression (37) into all the equalities 
(34)-(36). This results in 


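$$\frac{\partial^2 U}{\partial t^2} = a^2 \frac{\partial^2 U}{\partial x^2} + \left[ f(x, t) + a^2 \frac{\partial^2 g}{\partial x^2} - \frac{\partial^2 g}{\partial t^2} \right] = a^2 \frac{\partial^2 U}{\partial x^2} + F(x, t)$$

$$U|_{x=0} = \chi_1(t) - g|_{x=0} = 0, \quad U|_{x=l} = \chi_2(t) - g|_{x=l} = 0 \tag{38}$$

$$U|_{t=0} = \left[ \varphi(x) - g|_{t=0} \right] = \Phi(x), \quad \left.\frac{\partial U}{\partial t}\right|_{t=0} = \left[ \psi(x) - \left.\frac{\partial g}{\partial t}\right|_{t=0} \right] = \Psi(x)$$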


where $F$, $\Phi$ and $\Psi$ designate the expressions in the square brackets
(let the reader perform all the calculations). When deducing equalities
(38) we have taken into account that the function $g(x, t)$
satisfies boundary conditions (35). Thus, to find $U(x, t)$ we must
now solve a problem which is completely analogous to the problem
solved above, the only difference between them lying in the notation
and in the particular form of the functions $f$, $\varphi$ and $\psi$. After the
function $U$ has been determined we obtain the solution of the original
problem by means of formula (37).
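The reduction can be illustrated with a small sketch. It assumes the linear-in-$x$ choice of $g$ built from the two boundary functions (called `chi1` and `chi2` in the code), for which $\partial^2 g/\partial x^2 = 0$, so the new right-hand side is $F = f - \partial^2 g/\partial t^2$. The sample boundary functions, the zero forcing and the finite-difference step are arbitrary assumptions.

```python
import math

def make_g(chi1, chi2, l):
    """g(x, t) = chi1(t) + (x / l) * (chi2(t) - chi1(t)); matches the data at x = 0 and x = l."""
    def g(x, t):
        return chi1(t) + (x / l) * (chi2(t) - chi1(t))
    return g

def second_time_derivative(g, x, t, h=1e-4):
    """Central finite-difference approximation of the second derivative of g in t."""
    return (g(x, t + h) - 2.0 * g(x, t) + g(x, t - h)) / h**2

def make_F(f, g, h=1e-4):
    """Right-hand side of the problem for U; g is linear in x, so the g_xx term drops out."""
    def F(x, t):
        return f(x, t) - second_time_derivative(g, x, t, h)
    return F

# Sample data (assumptions): ends moving as sin(t) and t^2, no external force.
l = 2.0
chi1 = math.sin
chi2 = lambda t: t * t
g = make_g(chi1, chi2, l)
F = make_F(lambda x, t: 0.0, g)

print(g(0.0, 0.7), chi1(0.7))   # equal: g satisfies the condition at x = 0
print(g(l, 0.7), chi2(0.7))     # equal: g satisfies the condition at x = l
print(F(0.5, 0.7))              # forcing term of the problem for U
```

Even with $f = 0$ the problem for $U$ is non-homogeneous: the motion of the ends shows up as the forcing term $F$.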
Equality of mixed partial derivatives

304 

Equation 

algebraic 1/9 

general, of a curve of the second
order 105-108 
integral 562 
of a curve 82 

in polar coordinates 8*-86 
of an algebraic surface of the 
second order 327 
of a surface 313 

of longitudinal vibration of a bar 
781 

of oscillations of a string 712, 783 
polar, of a curve of the second 
order 103 

transcendental 1/9 
Equations of mathematical physics 
780ff

Equilibrium state oo9 
Error 32 

function 448 
limiting 32 
maximum absolute 34 
maximum relative 33 
true 32 


Euclidean 
basis 247 
space 244, 739 

infinite-dimensional 707

Euler’s 

broken line 579 
constant 650 
differential equation 548 
formulas 265, 266, 399 
integral of the first kind (beta 
function) 471 

integral of the second kind 
(gamma function) 469 
method (for differential equations)
578, 579 

theorem (on homogeneous func- 
tions) 300 
Evolute 255, 373
Evolvent (involute) 255, 373 
Expectation of a random variable 741 
Exponential form of a complex num- 
ber 266 

Exponential integral 449 
Extrapolation 45, 61, 19/, 198, 288 
Extremum 167 
absolute 168 
conditional 384 
cuspidal 169

of a function of several variables 
375 

point of 166, 167 

relative 168 

unconditional 384 

with unilateral constraints 388

Factorization of a polynomial 271-273 
Favourable outcome (case) 723
Fermat’s principle 172 
Fibonacci numbers 680 
Field 293

centrally symmetric ob9, bdi 

non-stationary 293 

plane 294 

plane-parallel 294 

scalar 293 

stationary 293 

vector 293, 626ff

velocity 525 

Finite difference 192-196 
central 195 
divided 193, 304 
central 195 
mixed 304 
partial 303, 304 

First integral of a differential
equation 526ff
Fixed point method of representing
numbers in a computer 766 
Floating point method of representing 
numbers in a computer 766 
Flux of a vector field 627 
Focal point of a differential equation 
515 

Forced oscillations 532, 538, 547, 
783 

Formula of total probability (parti- 
tion formula) 730 

Formulas for probability of hypotheses 
(Bayes’ formula) 731 
FORTRAN 779 
Forward extrapolation 198
Fourier 

integral 715 
series 690-706 

application of 711ff 
in complex form 702ff 
multiple 710, 711 
transform 715 

application of 719, 720 
cosine 716 
inverse 715 
properties of 717 
sine 716-719 

Free 

oscillations 497, 498, 544, 548 
vector 213 

Fresnel’s integrals 448 
Function 36-48 

absolutely integrable (summable) 
461, 465 
algebraic 52 
beta 471 

branch of 57, 291 

characteristic 748 

composite 42, 129 

continuous 49, 125 

cubic 53 

decreasing 49 

delta (Dirac) 488ff 

differentiable 151 

discontinuous 49 

domain of definition of 47, 48 

entire rational 52 

even 51 

exponential 53, 69, 70 
fractional-rational 52 
frequency 733 
gamma 468ff 

generalized 488-496, 621, 622 
generating 679 

graph of 45, 46, 54-56, 284 
homogeneous, of degree k 300


Function 

implicit 56-58, 291 

of two variables 370 
increasing 49 

influence (Green’s) 492ff, 539, 
540, 622 
inverse 58-60 
irrational 52 
jump of 127 
linear 53, 60 
linear-fractional 66, 67 
logarithmic 53, 68 
methods of representing of 42-45 
monotonic 49 
multiple-valued 49 
odd 51 

of an arbitrary number of argu- 
ments 291 

of a function (composite) 42, 
129 

of a point 293 

of a random variable 739-741 
of matrices 681 ff 
of three variables 292 
of two variables 282 

domain of definition of 286, 
287 

methods of representing of 
282-286 

particular value of 40, 41 
periodic 50, 51 
period of 51 
piecewise smooth 131 
point of discontinuity of 49, 126, 
127 

power 53, 63-66 

primitive (antiderivative) of 393 

quadratic 53, 62 

range of 48 

rational 52 

regression 747 

single-valued 48 

smooth 131 

square-summable (integrable) 465, 
707 

step 491 

the greatest value of 168-170, 
389, 390 

the least value of 168, 389, 390 
transcendental 53 
zero of 133 
zeta 650 

Functional 428, 536 
analysis 244 
linear 428 
relation 39 
series 661ff
Functions

algebraic 52 

algebraic classification of 51-53 
cylindrical (Bessel’s) 566 
dependent 363 
elementary 53 
basic 53 

derivatives of 142-146 
hyperbolic 79, 266 
independent 363 
inverse hyperbolic 71 
inverse trigonometric 53, 74 
linearly independent 530 
special 416 

trigonometric 53, 72, 73 
Fundamental 

frequency 533 
harmonic 696 

system of solutions of a homoge- 
neous linear differential equa- 
tion 530 

theorem of algebra 271 
tone 789 


Gamma function 468ff 
Gaussian 

(total) curvature 384 
(normal) law 736, 750 

applications of 750-756 
multidimensional 739 
Gauss’ method (elimination scheme) 
208, 209, 335 

Geometric interpretation of a system 
of first-order differential equations 
522ff 

Gradient 366 

Graph of a function 45, 54-56 
Graphical method of representing 
functions 44 

Greatest lower bound 170 
Greatest value of a function 131, 
168-170, 389, 390 
Green’s 

formula 639 

function 492ff, 539, 540, 622
Guldin’s 

first theorem 481 
second theorem 602 


Half-life of a radioactive element 507 
Harmonic analysis 696 
Harmonic oscillations 72, 73, 269 
Heat equation 782 

Heaviside unit function (step func- 
tion) 491 


High-speed electronic computer 757, 
762 

Hilbert space 707 
Hodograph 249 
Homogeneous function 300 
Hooke’s law 780 
Hyperbola 66, 99-102 

canonical equation of 100 
centre of 100 
conjugate axis of 100 
eccentricity of 103 
focuses of 100 
principal axes of 100 
transverse axis of 100 
vertices of 100 

Hyperbolic point of a surface 383 
Hyperboloid(s) 324-326 
of one sheet 324 
of revolution 324, 326 
of two sheets 325 
Hyperplane 243 
Hypocycloid 90 

Hypothesis of plane sections 780 

Image (under a mapping) 340 
Imaginary 
axis 260 

part of a complex number 259 
unit 259 

Improper integral 454ff

absolutely convergent 461, 617 
comparison test for convergence 
of 459ff

conditionally convergent 462 
convergent 455, 615 
dependent on a parameter 476ff
divergent 455, 615 
divergent to infinity 455 
multiple 615ff

dependent on a parameter
617ff

oscillating divergent 457 
properties of 458-464 
Improper rational fraction 277 
Increment of a variable (function) 60, 
139, 140, 294, 296 
Independent equations 301 
Indeterminate forms 113, 116, 127, 
129, 130, 132, 158-160 
Index of summation 118 
Infinite interval 30 
Infinitesimals 109-112 

comparison of 121-122 
equivalent 121 

Initial conditions 499, 500, 524, 712, 
784 

Initial phase of a harmonic 72 
Initial value problem 500, 784
Input 

of a computer 763 
variable 758 

Instantaneous velocity 135, 250 
Instruction 763, 769ff
Integrable combination 520, 527 
Integral(s) 
cosine 449 
curve 499, 523 

curve of a differential equation
499 

definite 420 

dependent on a parameter 474-478 
continuity with respect to the 
parameter of 475 
differentiation with respect 
to the parameter of 475, 496 
integration with respect to 
the parameter of 476 
double 587 

geometric meaning of 592 
exponential 449 
Fresnel’s 448 
improper 454, 615 
indefinite 394 
multiple 587 

applications of 589ff
of higher order 622ff
properties of 587ff
of a periodic function 430, 431
over a manifold 622ff
over a plane figure 599ff
over a rectangle 596ff
proper 454 
sine 449 
Stieltjes 621 
sum 419, 586, 587, 597 
surface 602ff

test for convergence of a series 
647 

triple 587 
volume 604ff

with respect to a measure 620-622 
Integrand 394, 420 
Integrating 
factor 511 

functions by means of series 450, 
467, 468 
inequalities 433 
Integration 

elementary methods of 393-404 
element of 394, 420 
by change of variable (by substi- 
tution) 402, 428, 429 
by means of differentiation 517,
518


Integration 

by parts 400, 428
by quadratures 503 
limits of 420 
mechanical 450
numerical 450-454 
of irrational functions 407-412 
of rational functions 405-407 
of trigonometric functions 412-415 
Interpolation 45, 60, 182, 191, 196, 
287, 452 
Interval 30 

of constancy of a function 49 
of convergence of a power series 
667 

of integration 420 
of monotonicity of a function 
49, 165, 166 

Invariance of the form of the differential
152, 299
Invariant 92 

Inverse interpolation 198

Inversion of order of integration 598 
Isocline 502 

Isolated singular point of a plane 
curve 372 
Isomorphism 

of Euclidean spaces 247 
of linear spaces 243 

Jacobian 302 

Joint distribution 737ff

Jump discontinuity 127, 289 

Lagrange’s 

differential equation 517
form of the remainder of Taylor’s 
series 165 

interpolation formula 191, 192, 
242 

method of undetermined multi- 
pliers 386 

method of variation of arbitrary 
constants 506, 531, 535, 554, 
555 

theorem (on finite increments) 187 
Lame’s coefficients 608, 611, 641 
Laplace’s equation 784, 791 
Law of large numbers 753 
Law of motion 87 
Law of refraction 173 
Least-square method 380, 381 
Least 

upper bound 170 
value of a function 131, 168-170, 
389, 390 

Lebesgue measure 620, 623, 642, 739 
Left-hand screw rule 227
Legendre’s polynomials 689 
Leibniz 

formula (for differentiation of 
an integral with respect to the 
parameter) 475 

rule (for differentiation of a pro- 
duct) 156 

test for convergence of a series 
650 

Lemniscate 91
Level 

lines 285, 371 
surfaces 292, 368 
L’Hospital’s rule 158, 159 
Limit(s) 113-117 
inferior 115 
infinite 115 
left-hand 126 

of a power-exponential expression 
131 

of integration (upper, lower) 420 
point 114 

properties of 115-117 
right-hand 126 
superior 115 
Linear 

algebra 238, 244 
approximation 287 
combination of vectors 216, 241 
differential equations 505, 506, 
528-535, 541-549, 553ff 
extrapolation 61, 288 
function 53, 60 

interpolation 61, 182, 287, 452 
law of elasticity 492, 498 
operator 493, 528, 551
space 237 

Linearization 151, 183 
Linearly dependent (independent) func- 
tions 530 

Linearly dependent (independent) vec- 
tors 217, 241 
Line integral 

of the first type 479 
of the second type 482ff
Localized (bound) vector 213 
Location (of memory of a computer) 
762 

Logarithmic 
scale 28 
spiral 85, 86 
Lower bound 30, 31 
Lyapunov stability 558ff 

Machine translation 769 
Maclaurin’s series 164 


Magnetic core, drum and tape 768, 
769 

Mapping 282, 339ff 
degenerate 361 
eigenvalue of 350 
eigenvector of 350 
identity 347 
into a space 340, 341 
inverse 344, 359 
isometric (orthogonal) 353 
linear 340, 341
matrix of 342 
nonlinear 358-362 
one-to-one 359 
onto a space 341 
Marginal distribution 738 
Mass-scale phenomena 722 
Mathematical physics 780 
Mathematical statistics 756 
Matrix (matrices) 329 
addition of 331 
characteristic equation of 335 
column 330 
complex 333 

degenerate (singular) 334 
determinant of 331 
diagonal 330 
eigenvalue of 335 
eigenvector of 335 
form of a system of linear diffe- 
rential equations 557 
inverse 333 
multiplication of 332 
non-degenerate (non-singular) 334 
of a linear mapping 342 
operations on 331-333 
order of 330 
orthogonal 352 
principal diagonal of 330 
rank of 337 
row 330 

skew-symmetric (antisymmetric)
331 

square 330 
symmetric 331 

properties of 353ff
transposed 330 
transposed conjugate 333 
unit 330 
zero 330 
Maximum 

of a function of one variable 167 
of a function of several variables 
375ff

Mean 

density 435 
deviation 661 
Mean

square deviation 661 
value 

of a function 434, 589
of a random variable 741
Measure 586, 620ff, 642, 739
theorem 434 

Memory (storage) of a computer 762 
Method(s) 

Adams 582, 583

approximate, for solving differen-
tial equations 562ff
collocation 275, 280
combined 183
cut-and-try 181, 182
decomposition (for integrals) 398
direct (for finding an extremum)
388

Euler’s (broken line) 578, 579
iterative 185-187

for solving differential equa-
tions 562ff

for systems of linear equa-
tions 208, 209
for systems of non-linear
equations 391
Lobachevsky’s 274
Milne’s 583, 584
Newton’s 182

for systems of equations
391

numerical, for solving differen-
tial equations 578-594
of chords 182
of elimination 208
of least squares 380, 381, 575
of moments 575
of parallel sections 322
of separation of variables 786ff
of steepest descent 387
of tangents (Newton’s) 182
of undetermined coefficients 278ff,
545

of variation of arbitrary constants
(parameters) 506, 531, 535, 554,
555

Runge-Kutta 580-582
Seidel’s 391
simplification 577
small parameter (perturbation)
189, 569ff
Minimax 377 
Minimum 

of a function of one argument 167 
Minor 

of a determinant 203 
of a matrix 337 


Mixed 

partial derivatives 304 
problem 785 

(triple) scalar product 235 
Mobius strip 624 
Modulus 

of a vector 212 
of elasticity (Young’s modulus) 
538, 780

Moment(s) 

of a random variable 746 
of a vector about a point 233 
of inertia 538, 596
static 590, 596 

Monotonicity, intervals of 49, 165, 
166 

Multiple root of an equation 272 
Multiple-valued function 49 
Multiplication 

of approximate numbers 36-39 
of a vector by a scalar 215, 216 
of complex numbers 261 
of operators 550 
rule of probability theory 728 
Multiplier(s) 

Lagrange’s 386 
(unit of a computer) 759 
Multiply connected domain 487 


Natural (fundamental) frequency 533 
Natural (Napierian) logarithms 68 
n-dimensional manifold (space) 310
Necessary condition for convergence 
of a series 120 

Necessary conditions for an extremum 
167 

Negative of a vector 215, 238
Neighbourhood 31 
Neumann’s problem 786 
Newton-Leibniz theorem 427 
Newton’s 

interpolation formula 196-198,
452 

law of gravitation 613, 619 
method (of tangents) 182 
Nodal point 90, 372, 515
Node of a standing wave 787 
Nomography 285, 286
Non-orientable surface 624 
Non-perturbed solution of a differen-
tial equation 573

Non-restricting (one-sided) constraints
388

Non-trivial solution of a homogeneous 
system 335 
Norm of a vector 244 
Normal acceleration 252
Normal form of a system of differen- 
tial equations 522, 524 
Normal (Gaussian) law 736, 739, 750 
Normal plane (to a curve) 251 
Normal section of a surface 381 
principal 382 
Normal to a curve 137 
Normalization 244, 689, 751 
factor 739 
Number 

complex 259 

conjugate 263 
e 68, 124, 164, 509 
imaginary 260 
pure imaginary 260 
real 260 
scale 27 
system 

binary 764 
binary-decimal 765 
octal 765 
vector 330 
Numerical 

characteristics of random vari- 
ables 741-748 
integration 450-454, 649 
solution 

of algebraic equations 273- 
276 

of differential equations 578- 
594 


One-parameter family of curves 372 

Open interval 30 

Operation(s) 

commuting 347 
non-commuting 347 
Operator(s) 342, 493
commuting 550 
difference 550 
differential 528, 549, 550
equation 552 
linear 493, 528, 551 
method of solving differential 
equations 552, 553 
non-commuting 550 
non-linear 551 
of differentiation 473 
of multiplication by a number 
(by a given function) 550 
power series of 551 
powers of 551 
shift 550 
unit 550 
zero 550 


Optical properties of conic sections 147, 
148 

Order of an algebraic curve 90 
Orders of smallness 124 
Orientable 

manifold 623 
surface 624 

Orientation of a surface 227 
Orthogonality 

of functions 686, 687, 703 
of vectors 224, 246 
with weight function 708 
Orthogonalization 247, 689 
Orthogonal polynomials 688 
Orthogonal vectors 221 
Oscillating divergent series 652 
Oscillations 

damped 66, 109, 114, 544 
forced 532, 547, 548, 783 
free 497, 544, 548, 783 
harmonic 72, 73, 269 
undamped 114 
Osculating circle 254 
Osculating plane 252 
Ostrogradsky’s theorem 630, 782 
Overflow 771, 789 
Overtone 697 


Parabola 62 
axis of 62 
cubic 64 
quadratic 62 
safety 373 
semicubical 65 
vertex of 62 

Parabolic point of a surface 383 
Paraboloid 

elliptic 326 
hyperbolic 327 
of revolution 316, 326 
Parallelogram law for addition of 
vectors 213 
Parameter 27 
Parametric representation 
of a curve 87 
of a function 87, 318 
of a surface 317 
Parseval relation (theorem) 
for Fourier series 704 
for Fourier transform 719 
Partial 

derivative 294 

of a composite function 298 
difference 303 
quotient 304 
differential 294, 303 
differential equation 498 
Partial

increment 294, 296 
rational fraction 278, 280 
Period of a function 50, 51 
Perturbed solution 189 
Phase 

of a harmonic 72 
space 525 
trajectory 525 

Planar point of a surface 384
Planimeter 450 

Point of discontinuity of a function 
49, 126, 127 

Point of inflection 64, 173 
Poisson 

equation 784 
law 735 
Polar 

angle 81 
coordinates 81 
radius 81 

Polynomial 52, 271ff
Potential of a force 456 
Practically impossible event 726, 
731 

Preimage (original, or inverse image) 
340 

Primitive period 51 
Principal 

normal 252 

value of a divergent integral 473 
value of an inverse trigonometric 
function 74 
Probability 723 
a posteriori 730 
a priori 730 
conditional 727 
distribution 733 
conditional 746 
integral 737 
properties of 725-727 
Problem of two bodies 104 
Program (for a computer) 44, 767ff
Programming 764ff, 771ff
Projection of a vector 219 
Proper integral 454 
Proper rational fraction 277 
Properly divergent series 645 
Properties 

of continuous functions 129-131 
of definite integral 426-431 
of derivatives 139-142 
of indefinite integral 397-399 
of infinitesimals 111, 112, 122 
of limits 115-117 
Pseudoscalar 234 
Pseudovector 234 


Punch 

card 762 
tape 767 

Pythagoras’ theorem 79, 224, 707 


Quantity 25 
constant 26 
dimension of 25 
dimensionless 25 
variable 26 
Quadrants 79 
Quadratic form 355 
matrix of 356 
negative definite 379 
positive definite 379 
reduction to a diagonal form 
of 357 

Quadratic function 53, 62 


Radioactive decay 507 
Radius of convergence of a power 
series 667, 676 
Radius-vector 222 
Random 

event(s) 721 

certain (sure) 725 
impossible 725
independent 728 
mutually exclusive 727 
opposite (contrary) 726 
practically impossible 726, 
731 

variable 732 

continuous 732 
discrete 732 
multidimensional 738 
normalized 751 
uniformly distributed 736 
vector 739 
Range 

of a function 48 
of a variable 30 

Rationalization of integrals 407 
Real part of a complex number 259 
Real time simulation 761 
Reducing the order of a differential
equation 519ff

Reduction of a higher-order differen-
tial equation to a system of first-
order equations 521, 522
Regression 
curve 747 
function 747 
linear 748, 756 
Relative frequency 722
Remainder 120, 162, 165 
Resolution of a vector along given 
vectors (axes) 217, 218 
Resonance 548 
Riccati’s equation 506, 563 
Riemann zeta function 656 
Right-hand screw rule 227 
Root 

multiple 271 
of a function 272 
of a polynomial 271 
simple 272 

Rotation (curl) of a vector field 636, 
642 

Roulette 89, 258 

Rounding 34 

Routine (program) 767 

Rule of a reserve decimal digit 35 

Saddle point 515 
Sampling 721, 729 

with replacement 729 
without replacement 729 
Scalar 212 

product of functions 706 
product of vectors 220, 221, 244 
expression in Cartesian coor- 
dinates of 224 
properties of 221, 222
square of a vector 222 
Scale 

factor 62 
logarithmic 28 
non-uniform 28 
uniform 27 

Screw line (circular helix) 249 
Sequence 48 
Serial storage access 769 
Series 

alternating 650 

application to solving differen- 
tial equations of 564-572 
convergent 118 

absolutely 650, 658, 660 
conditionally 650 
divergent 119, 645, 652 
functional 661 ff 

uniform convergence of 663 
general term of 118 
harmonic 649 

in orthogonal functions 689ff
multiple 659ff 
numerical 117 
operations on 652-654 
oscillating divergent 652 
partial sum of 118, 645 


Series 

positive 645 
power 666-677 
properly divergent 645 

rearranging the terms of 653, 
654 

remainder of 120 
sum of 118 

Sign of double substitution 425 
Sign of identity 50 
Similarity coefficient 86 
Similarity transformation 86 
Simplifying general equation of an 
algebraic surface of the second 
order 327, 328 
Simply connected domain 487 
Simpson’s formula 452, 453, 649 
Sine integral 449 
Single-valued function 48 
Singular 

curve of a differential equation 
512 

integral curve 514 
lines on a surface 384 
point 

of a curve 372 
of a differential equation 
512, 524, 566 
of a surface 371, 384 
solution of a differential equation 
500, 514 
Sink 629 
Sinusoid 72 
Slide rule 28, 29 
Sliding vector 213 
Slope of a straight line 60 
Source 629 
Space 

Euclidean 244 
Hilbert 707 
linear 237 

n-dimensional Cartesian 239 
of events 310 
topological 310 
Specific heat 782 
Spectral density 715 
Spectrum 

continuous 713 
discrete 713 
of a problem 538 
Spheroid 323 
Spinode 65 

Standard deviation 745 
Standard methods of integration 404- 
415 

Standing wave 787 
State space 525 
Stationary (critical) point 376
Stationary (critical) value 166, 173
Step of a table 43
Stieltjes integral 621
Stiffness factor 498
Stokes’ theorem 640
Storage system, parallel 769
Subroutine (subprogram) 768
Subsidiary conditions (for a condi-
tional extremum) 384
Subspace (submanifold) 239, 312
Substitution

hyperbolic 409
trigonometric 408
Subtraction

of approximate numbers 34, 36
of vectors 215

Sufficient conditions for an extremum
167, 168, 377-379
Summation sign 118
Sum of vectors 213, 214
Superposition principle 493, 531, 551
Surface

conic 315
cylindrical 315
integral 602ff
of revolution 315

Symmetric form of a system of diffe-
rential equations 523
System

of first-order approximation 560
of first-order differential equations
522-526

of linear algebraic equations
206-211

homogeneous 211
of linear differential equations
553ff

Tabular integrals 396
Tabular method of representing func-
tions 43
Tangent curve 73

Tangent plane to a surface 368, 369
Tangential acceleration 252
Taylor’s

formula 161, 374, 375
series 163
Term-by-term

differentiation of a series 666, 668
integration of a series 665, 668
Test for distinguishing multiple roots

& i 

Theory of probability 721
Torricelli’s law 438
Total

derivative 368 


Total 

differential 296, 297, 305
increment 296 
Trajectory of motion 87 
Transcendental curves 90 
Transcendental surfaces 314 
Transfer instruction 771, 775 
Transformation of a matrix 
of a mapping 347ff
of a quadratic form 356 
Transient process 128, 559 
Translation of a vector 213 
Transposing a determinant 203 
Trapezoid rule 451 
Trial 721 

Triangle inequality 245 
Trigger 768 

Trigonometric form of a complex
number 260 

Trigonometric series 686-690 
Triple 

scalar product of vectors 235 
vector product of vectors 236 
Trivial solution of a homogeneous 
system 335 

Two-state element 768 
Two-way series 702 


Umbilical point of a surface 382 
Uncertainty principle 718 
Unconditional transfer of control 775 
Undetermined coefficients, method of 
278ff, 545, 565, 566
Unfavourable outcome (case) 724 
Unilateral (one-sided or non-restrict- 
ing) constraints 388 
Unit vector 221 
Unperturbed solution 189 
Upper bound 30, 31 

Variable 26, 248 
bounded 31 
bounded above 31 
bounded below 31 
continuous 29 
dependent 39 
discrete 29 
independent 39 
infinitely large 112 
infinitesimal 109 
monotonic 30 
oscillating 114 
point 28 
random 732 
range of 30 
vector 248 
Variance (dispersion) 744
Variation 572 
Variational equation 573 
Vector 212, 218 

component of 217 
coordinates of 219 
field 367, 626ff

function of a scalar argument 248 

length of 212 

line 626 

modulus of 212 

moment of 233 

negative of 215 

of an area 228 

origin of 212 

product of vectors 228 

expression in Cartesian coor- 
dinates of 232 
properties of 230, 231 
projection of 219, 220 
solution of a system of differen- 
tial equations 557 
terminus of 212 
Vectorial angle 81 

Vector-parametric equation of a curve 
249 


Velocity 

average 134 
convective 368 
instantaneous 135 
Vertex of a curve 254, 257 
Volume of a solid 445ff


Wave equation 784 
Wave length 73 
Wave number 73 

Weierstrass’ test for uniform conver- 
gence of a functional series 663 


Young’s modulus 538, 780 


Zero(s) 

line of 290 

of a function 133, 272 
of a polynomial 271 ff 
Zero-dimensional space 241 
Zero matrix 330 
Zero vector 215 
Zeta function 650 



List of Symbols 


Constants 


e 68; p n 677; B n 678; 156, 165; C 650. 


Functions ( general notation ) 

/ (x) 40; z/|.-t=a40; / (x, y) 41; z|x=a41, 


/( <P(*))42; y(x) 42. 


Functions ( special notation) 